Encoding method, decoding method, encoder, decoder, and storage medium
Storing the filter-related parameters in the adaptive parameter set (APS) solves the problem of loop filters wasting cache resources in Versatile Video Coding (VVC), and thereby saves cache resources.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP LTD
- Filing Date
- 2024-12-23
- Publication Date
- 2026-07-02
AI Technical Summary
In Versatile Video Coding (VVC), updating and parsing the filter-related parameters of the loop filter consumes a large amount of cache resources, which wastes cache resources.
Storing filter-related parameters in an adaptive parameter set (APS) instead of at the image or slice level reduces cache usage by updating them only when needed.
By storing filter-related parameters in the APS, the cache requirements at the slice and image levels are reduced, saving cache resources.
Abstract
Description
Encoding / decoding methods, encoders, decoders, and storage media
Technical Field
[0001] This application relates to the field of video encoding and decoding technology, and in particular to an encoding method, a decoding method, an encoder, a decoder, and a storage medium.
Background Technology
[0002] In Versatile Video Coding (VVC) or Enhanced Compression Model (ECM), in-loop filters encompass various types of filters. Encoding and decoding the filter-related parameters of these filters requires corresponding buffers to handle the storage of the parsed coefficients, and the updating and parsing of these filter-related parameters consumes significant buffer resources.
Summary of the Invention
[0003] In a first aspect, embodiments of this application provide a decoding method applied to a decoder. The method includes: obtaining filter-related parameters of the current block from the APS, wherein the filter-related parameters are those of a filter based on a reference position, and the reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-located position of the current position on the reference image; determining the filter coefficients of the filter of the current block according to the filter-related parameters of the current block; and filtering the reconstructed value of the current position according to the reference position of the current position and the filter coefficients of the filter of the current block.
[0004] In a second aspect, embodiments of this application provide an encoding method applied to an encoder. The method includes: determining filter-related parameters of the current block, wherein the filter-related parameters are those of a filter based on a reference position, and the reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-located position of the current position on the reference image; determining the filter coefficients of the filter of the current block according to the filter-related parameters of the current block; and filtering the reconstructed value of the current position according to the reference position of the current position and the filter coefficients of the filter of the current block.
[0005] In a third aspect, embodiments of this application provide an encoder, which includes a first determining unit and a first filtering unit, wherein: the first determining unit is configured to determine filter-related parameters of the current block, the filter-related parameters being those of a filter based on a reference position, wherein the reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-located position of the current position on the reference image; the first determining unit is further configured to determine the filter coefficients of the filter of the current block according to the filter-related parameters of the current block; and the first filtering unit is configured to filter the reconstructed value of the current position according to the reference position of the current position and the filter coefficients of the filter of the current block.
[0006] In a fourth aspect, embodiments of this application provide an encoder, which includes a first memory and a first processor, wherein: the first memory is used to store a computer program that can run on the first processor; and the first processor is used to execute the method described in the second aspect when running the computer program.
[0007] In a fifth aspect, embodiments of this application provide a decoder, which includes a first acquisition unit, a second determination unit, and a second filtering unit, wherein: the first acquisition unit is configured to acquire filter-related parameters of the current block from the APS, the filter-related parameters being those of a filter based on a reference position, wherein the reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-located position of the current position on the reference image; the second determination unit is configured to determine the filter coefficients of the filter of the current block according to the filter-related parameters of the current block; and the second filtering unit is configured to filter the reconstructed value of the current position according to the reference position of the current position and the filter coefficients of the filter of the current block.
[0008] In a sixth aspect, embodiments of this application provide a decoder, which includes a second memory and a second processor, wherein: the second memory is used to store a computer program that can run on the second processor; and the second processor is used to execute the method described in the first aspect when running the computer program.
[0009] In a seventh aspect, embodiments of this application provide a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the method described in the first aspect or the method described in the second aspect.
[0010] In an eighth aspect, embodiments of this application provide a computer program product, including a computer program or instructions that, when executed by a processor, implement the method described in the first aspect or the method described in the second aspect.
[0011] In a ninth aspect, embodiments of this application provide a computer-readable storage medium having a bitstream stored thereon, the bitstream being generated by performing the steps of the encoding method as described in the second aspect.
[0012] It is understood that in the embodiments of this application, the filter-related parameters of the filter based on the reference position are placed in the APS rather than at the image level or slice level. This is because, regardless of whether these filter-related parameters are encoded or decoded at the image level or slice level, a corresponding cache is needed to handle the storage of the parsed coefficients. However, the inventors of this application have found that the updates and parsing of these filter-related parameters are not actually that frequent and do not need to be updated at the image level or slice level. Therefore, placing these filter-related parameters in the APS reduces the cache size at the slice level and image level, thus saving cache resources.
Attached Figure Description
[0013] Figure 1 is a schematic diagram of unidirectional filtering provided in an embodiment of this application;
[0014] Figure 2 is a schematic diagram of unidirectional filtering provided in an embodiment of this application;
[0015] Figure 3 is a schematic diagram of bidirectional filtering provided in an embodiment of this application;
[0016] Figure 4 is a schematic diagram of the bidirectional filtering proposed in the embodiment of this application;
[0017] Figure 5 is a schematic diagram of bidirectional filtering provided in an embodiment of this application;
[0018] Figure 6 is a schematic diagram showing the shapes of the two filters used in the bidirectional filtering provided in the embodiment of this application.
[0019] Figure 7 is a schematic diagram showing the shapes of the two filters used in the bidirectional filtering provided in the embodiment of this application.
[0020] Figure 8 is a schematic diagram of the shape of a filter used in a unidirectional filter provided in an embodiment of this application;
[0021] Figure 9 is a schematic diagram of the shape of a filter used in a unidirectional filter provided in an embodiment of this application.
[0022] Figure 10 is a schematic diagram of a video encoding and decoding network architecture provided in an embodiment of this application;
[0023] Figure 11 is a schematic block diagram of the system composition of an encoder provided in an embodiment of this application;
[0024] Figure 12 is a schematic block diagram of a decoder system provided in an embodiment of this application;
[0025] Figure 13 is a schematic diagram of the implementation flow of the decoding method provided in the embodiment of this application;
[0026] Figure 14 is a schematic diagram of a further implementation process of step 1301 provided in an embodiment of this application;
[0027] Figure 15 is a schematic diagram of the implementation flow of the encoding method provided in the embodiment of this application;
[0028] Figure 16 is a schematic diagram of two filter shapes that are suitable for both unidirectional and bidirectional filtering, provided in the embodiments of this application;
[0029] Figure 17 is a schematic diagram of the composition structure of the encoder provided in the embodiment of this application;
[0030] Figure 18 is a schematic diagram of the hardware structure of the encoder provided in an embodiment of this application;
[0031] Figure 19 is a schematic diagram of the composition structure of the decoder provided in the embodiment of this application;
[0032] Figure 20 is a schematic diagram of the hardware structure of the decoder provided in an embodiment of this application;
[0033] Figure 21 is a schematic diagram of the composition structure of an encoding / decoding system provided in an embodiment of this application.
Detailed Implementation
[0034] In order to gain a more detailed understanding of the features and technical content of the embodiments of this application, the implementation of the embodiments of this application will be described in detail below with reference to the accompanying drawings. The accompanying drawings are for reference and illustration only and are not intended to limit the embodiments of this application.
[0035] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of this application only and is not intended to limit this application.
[0036] In the following description, references are made to “some embodiments,” which describe a subset of all possible embodiments. However, it is understood that “some embodiments” may be the same subset or different subsets of all possible embodiments and may be combined with each other without conflict.
[0037] It should also be noted that the terms "first, second, and third" used in the embodiments of this application are only used to distinguish similar objects and do not represent a specific order of objects. It is understood that "first, second, and third" can be interchanged in a specific order or sequence where permitted, so that the embodiments of this application described herein can be implemented in an order other than that illustrated or described herein.
[0038] In video images, a first color component, a second color component, and a third color component are generally used to represent a coding block (CB). These three color components are a luma component, a blue chroma component, and a red chroma component, respectively. Specifically, the luma component is usually represented by the symbol Y, the blue chroma component is usually represented by the symbol Cb or U, and the red chroma component is usually represented by the symbol Cr or V. Thus, video images can be represented in YCbCr format or YUV format.
[0039] Before providing a further detailed description of the embodiments of this application, the nouns and terms that may be involved in the embodiments of this application will be explained first. The nouns and terms involved in the embodiments of this application are subject to the following interpretations.
[0040] H.266 / Versatile Video Coding (VVC); VVC Test Model (VTM); Enhanced Compression Model (ECM); Joint Video Experts Team (JVET); Motion Vector (Mv); Intra Block Copy (IBC); Block Vector (BV); Adaptive Parameter Set (APS); Reference Picture List (RPL), where bidirectional inter-coded frames (B-frames) have two reference picture lists and unidirectional inter-coded frames have one; Temporal Filter (TF); Sample Adaptive Offset (SAO); Coding Unit (CU); Prediction Unit (PU); Transform Unit (TU); Coding Tree Unit (CTU); Largest Coding Unit (LCU).
[0041] It should be understood that the technique of adaptive filtering using a reference image involves finding a reference position by using the Mv or BV used at the current filtering position, or by using a position in the reference image with the same coordinates as the current filtering position as the reference position, and using the reconstructed pixels of the reference position as the input information source for the filter to generate a filtered value. This filtered value is then applied to the current filtering position to correct the reconstructed value at the current filtering position. In this embodiment, this filtering technique is referred to as a reference position-based filtering technique, or simply TF filtering technique or TF technology. The filtering-related parameters of the filter required for TF filtering are referred to as the filtering-related parameters of the reference position-based filter. That is, the "filtering-related parameters" mentioned in this embodiment refer to the filtering-related parameters of the reference position-based filter. The APS including these filtering-related parameters is called TF_APS. Furthermore, it should be noted that the current filtering position can also be understood as the current position, i.e., the position (x, y) of the current sample in the current block, where Mv is selected for inter-frame prediction in the current block, and BV is selected for IBC prediction in the current block. In one possible implementation, the filtered value is applied to the current filtering position to correct the reconstructed value at the current filtering position. The corrected reconstructed value at the current filtering position is equal to the sum of the filtered value and the reconstructed value before correction.
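The correction described in the last sentence above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `correct_sample`, the default bit depth of 10, and the final clamp to the valid sample range are all assumptions added for the example.

```python
def correct_sample(recon: int, filtered: int, bit_depth: int = 10) -> int:
    """Return the corrected reconstructed value at the current position.

    The corrected value is the sum of the reconstructed value before
    correction and the filtered value. The clamp to [0, 2^bit_depth - 1]
    is an assumption for this sketch; the text does not state it.
    """
    max_val = (1 << bit_depth) - 1
    return min(max(recon + filtered, 0), max_val)
```

For instance, a reconstructed value of 512 with a filtered value of -3 gives 509 under these assumptions.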
[0042] In related technologies, in addition to using Mv and BV to find the reference position, adaptive filtering using a reference image can also use a position in the reference image that has the same coordinates as the current filtering position as the reference position.
[0043] Encoder side: the filter coefficients are obtained by fitting, based on the error between the corrected reconstructed value at the current filtering position and the original value at the current filtering position. These filter coefficients are then written into the bitstream.
[0044] Since Mv and BV can be unidirectional or bidirectional, the filtering techniques described herein can be divided into unidirectional filtering and bidirectional filtering. The specific implementation methods of the filtering techniques are described below.
[0045] For example, in some embodiments, Figure 1 is a schematic diagram of unidirectional filtering provided in the embodiments of this application. As shown in Figure 1, in forward / backward filtering, the MV can be used to offset the current filtering position (x, y); that is, the MV can be used to offset the filtering center position on the reference image.
[0046] Since the current image may have one or more reference images, this means that the MV may point to any of these reference images, depending on the MV used by the coded block at the current filter position (x, y) and the reference image it points to.
[0047] For example, in some embodiments, Figure 2 is a schematic diagram of unidirectional filtering provided in the embodiments of this application. As shown in Figure 2, in forward / backward filtering, BV can be used to offset the position corresponding to the current filtering position (x, y), that is, BV can be used to offset the filtering center position of the current image.
[0048] When two reconstructed images are used as input, the corresponding position offset can be guided by two MV or BV.
[0049] For example, in some embodiments, Figure 3 is a schematic diagram of bidirectional filtering provided in the embodiments of this application. As shown in Figure 3, for bidirectional filtering, or when two reconstructed images are used as input, the offset can be based on two MVs.
[0050] For example, in some embodiments, Figure 4 is a schematic diagram of bidirectional filtering proposed in the embodiments of this application. As shown in Figure 4, for bidirectional filtering, or when two reconstructed images are used as input, the offset can be based on two BVs.
[0051] For example, in some embodiments, Figure 5 is a schematic diagram of bidirectional filtering provided in the embodiments of this application. As shown in Figure 5, bidirectional filtering may also combine one MV and one BV.
[0052] It is understandable that the loop filtering stage is performed on the reconstructed image after the current image has been reconstructed. Therefore, when performing temporal loop filtering, it can be determined whether the coding block at the current filtering position has selected intra-prediction, inter-prediction, or IBC prediction. For example, in the ECM reference software, if the block to which the current filtering position belongs is an intra-prediction block, and the intraTMP prediction mode is selected, one or more BVs will be cached in the motion buffer; if the IBC prediction mode is selected, one or more BVs will also be cached in the motion buffer; if the inter-prediction mode is selected, one or more MVs and the reference image index corresponding to each MV will be cached in the motion buffer.
[0053] Furthermore, for both unidirectional and bidirectional filtering, two filter shapes are available. For example, Figure 6 is a schematic diagram of the shapes of the two filters used in bidirectional filtering according to an embodiment of this application; Figure 7 is a schematic diagram of the shapes of the two filters used in bidirectional filtering according to an embodiment of this application; Figure 8 is a schematic diagram of the shape of one filter used in unidirectional filtering according to an embodiment of this application; and Figure 9 is a schematic diagram of the shape of one filter used in unidirectional filtering according to an embodiment of this application. It should be noted that for different filters, the filter coefficients at the same index position are the same.
[0054] The implementation method of the filtering technique is described below.
[0055] The implementation of adaptive filtering techniques based on reference images should include the following steps one through five.
[0056] Step 1: Parse sequence-level syntax elements, as shown in Table 1 below.
[0057] Table 1
[0058] The sequence identifier `sps_tf_enabled_flag` indicates whether the current sequence can use reference image-based adaptive filtering. When this syntax element is not present in the bitstream, its value is inferred to be 0. A value of 1 indicates that the current sequence can use reference image-based adaptive filtering, while a value of 0 indicates that reference image-based adaptive filtering cannot be used.
[0059] Step 2: Parse the slice-level syntax elements, as shown in Table 2 below.
[0060] Table 2
[0061] Where:
[0062] The `sh_tf_enabled_flag` syntax element is a flag in the slice header indicating whether reference image-based adaptive filtering is enabled. If this syntax element is not present in the bitstream, its value is inferred to be 0. A value of 1 indicates that reference image-based adaptive filtering can be used in the current slice; a value of 0 indicates that reference image-based adaptive filtering is disabled.
[0063] The `tf_filter_mode` syntax element indicates which reference image-based adaptive filtering mode is used for the current slice. The values 0 to 5 indicate which of the six filtering modes (unidirectional and bidirectional) is used for the current slice. These six filtering modes include, but are not limited to: forward filtering mode based on inter-frame prediction, backward filtering mode based on inter-frame prediction, bidirectional filtering mode based on inter-frame prediction, forward filtering mode based on the temporally nearest image, backward filtering mode based on the temporally nearest image, and bidirectional filtering mode based on the temporally nearest images. It is understandable that when the prediction mode of the current block is inter-frame prediction mode, the current block can select reference images for inter-frame prediction from the forward reference image list and / or backward reference image list. If forward inter-frame prediction is selected, the forward filtering mode based on inter-frame prediction is used, and the reference image is the selected forward reference image; if backward inter-frame prediction is selected, the backward filtering mode based on inter-frame prediction is used, and the reference image is the selected backward reference image; if bidirectional inter-frame prediction is selected, the bidirectional filtering mode based on inter-frame prediction is used, and the reference images are the selected forward reference image and backward reference image. The other three filtering modes operate on the temporally nearest reference images. Taking the bidirectional mode as an example, if the current frame is the 9th frame, the two temporally nearest reference images are the 10th frame and the 8th frame. The reconstructed values at the corresponding reference positions of the current position on these two reference images are used as the input of the filter; that is, the corresponding reference position is the center position of the filter, and the reconstructed values of the area covered by the filter on the reference image are used as the input of the filter.
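The 9th-frame example above can be illustrated with a short sketch. It assumes that frames are identified by their position in display order and that "forward" means earlier in that order; the function name `nearest_references` and the list of available frames are hypothetical.

```python
from typing import List, Optional, Tuple


def nearest_references(current: int, available: List[int]) -> Tuple[Optional[int], Optional[int]]:
    """Pick the temporally nearest earlier and later reconstructed frames.

    Frames are identified by display-order numbers (an assumption for this
    sketch). Returns (forward, backward); either is None if no such frame
    is available.
    """
    earlier = [f for f in available if f < current]
    later = [f for f in available if f > current]
    forward = max(earlier) if earlier else None
    backward = min(later) if later else None
    return forward, backward
```

With frame 9 as the current frame and frames 0, 4, 8, 10, and 16 available, this returns frames 8 and 10, matching the example in the text.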
[0064] The `tf_num_filters_signalled_minus1` syntax element indicates the number of filters in the current slice minus one. In one implementation, since a slice can have a maximum of 8 adaptive filters based on the reference image, the value of `tf_num_filters_signalled_minus1` can be 0, 1, 2, ..., or 7. A value of 0 indicates that the current slice has one adaptive filter based on the reference image; a value of 1 indicates that the current slice has two filters; a value of 2 indicates that the current slice has three filters; and so on. When this syntax element is not present in the bitstream, its value is 0.
[0065] The variable numCoeff represents the number of filter coefficients in the filter.
[0066] The `tf_coeff_abs[sfIdx][j]` syntax element indicates the absolute value of the j-th coefficient of the filter group `sfIdx`, represented using K-order exponential Golomb code where K is 0. Its value is 0 when this syntax element is not present in the bitstream.
[0067] The `tf_coeff_sign[sfIdx][j]` syntax element indicates the sign of the j-th coefficient of the filter group `sfIdx`. A value of 1 indicates a negative coefficient, and a value of 0 indicates a positive coefficient. Its value is 0 when this syntax element is not present in the bitstream.
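As an illustration of how a coefficient might be recovered from `tf_coeff_abs` and `tf_coeff_sign`, the sketch below decodes a K-order exponential Golomb code in its textbook form (a run of zeros, a terminating one, then suffix bits) and applies the sign. The bit convention, the `BitReader` helper, and the function names are assumptions for this example; the actual binarization and parsing order are defined by the syntax table, which is not reproduced here.

```python
class BitReader:
    """Hypothetical helper that reads bits from a string such as '00100'."""

    def __init__(self, bits: str) -> None:
        self.bits = bits
        self.pos = 0

    def read(self, n: int) -> int:
        """Read n bits, most significant first; read(0) returns 0."""
        value = 0
        for _ in range(n):
            value = (value << 1) | int(self.bits[self.pos])
            self.pos += 1
        return value


def read_exp_golomb(reader: BitReader, k: int) -> int:
    """Decode one unsigned K-order exponential Golomb value (textbook form)."""
    leading_zeros = 0
    while reader.read(1) == 0:
        leading_zeros += 1
    info = reader.read(leading_zeros)
    quotient = (1 << leading_zeros) - 1 + info  # order-0 part
    remainder = reader.read(k)                  # k extra low-order bits
    return (quotient << k) + remainder


def signed_coeff(coeff_abs: int, coeff_sign: int) -> int:
    """Combine magnitude and sign: a sign of 1 means negative, 0 means positive."""
    return -coeff_abs if coeff_sign == 1 else coeff_abs
```

When a syntax element is absent, both values are inferred to be 0, so `signed_coeff(0, 0)` yields a zero coefficient.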
[0068] In one implementation, the unidirectional and bidirectional filters parsed from each slice can be stored in a buffer for reuse in subsequent image decoding. Currently, up to 8 sets of unidirectional filters and 8 sets of bidirectional filters can be stored, and the buffer adopts a FIFO format.
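A FIFO buffer of this kind can be sketched with a bounded double-ended queue. The class name `FilterHistory`, the representation of a filter set as an arbitrary object, and the convention that index 0 refers to the oldest stored set are assumptions for this example; the text only specifies the capacity of 8 and the FIFO behaviour.

```python
from collections import deque

MAX_STORED_SETS = 8  # up to 8 unidirectional and 8 bidirectional sets


class FilterHistory:
    """Hypothetical history buffer for filter sets parsed from earlier slices."""

    def __init__(self) -> None:
        # A deque with maxlen drops the oldest entry when a ninth is appended.
        self.unidirectional = deque(maxlen=MAX_STORED_SETS)
        self.bidirectional = deque(maxlen=MAX_STORED_SETS)

    def push(self, filter_set, bidirectional: bool) -> None:
        pool = self.bidirectional if bidirectional else self.unidirectional
        pool.append(filter_set)

    def get(self, reuse_index: int, bidirectional: bool):
        # Index 0 is the oldest stored set (an assumption for this sketch).
        pool = self.bidirectional if bidirectional else self.unidirectional
        return pool[reuse_index]
```

Because each pool holds at most 8 entries, a 3-bit index is enough to address any stored set.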
[0069] The `tf_reuse_flag` syntax element is an indicator that specifies whether the current slice encoding / decoding reuses the historical filter. A value of 1 indicates reuse of the historical filter, while a value of 0 indicates no reuse. When this flag is not present in the bitstream, its value is 0.
[0070] The `tf_reuse_index` syntax element represents the index of the historical filters reused in the current slice. The index value can be 0, 1, 2, 3, 4, 5, 6, or 7, represented using a fixed-length code of 3 bins. When this syntax element is not present in the bitstream, its value is 0. The index value indicates which set of filters in the buffer is reused.
[0071] `tf_shift_minus6` indicates the coefficient precision used by the current filter. Its value equals the shift value used in the filter minus 6. In one implementation, this syntax element is represented by a fixed-length code of length 2 bins, with values of 0, 1, 2, or 3. Its value is 0 when this syntax element is not present in the bitstream.
[0072] `tf_k_order` represents the exponential Golomb order used when encoding / decoding the absolute values of the filter coefficients for the current slice. Its value is either 0 or 1; 0 indicates the use of 0th-order exponential Golomb code, and 1 indicates the use of 1st-order exponential Golomb code. Its value is 0 when this syntax element is not present in the bitstream.
[0073] tf_shape_index represents the shape of the current filter. As shown in Figures 6-9, there are two shapes for both unidirectional and bidirectional filtering. tf_shape_index equal to 0 means shape 0 is selected, and equal to 1 means shape 1 is selected.
[0074] `tf_clip_flag` indicates whether the filter for the current slice uses non-linear truncation. This syntax element takes the value 0 or 1; 1 indicates that the filter uses non-linear truncation, and 0 indicates that it does not. Its value is 0 when this syntax element is not present in the bitstream.
[0075] `tf_clip_idx` represents the nonlinear index value used by the current filter coefficient in the current slice. This syntax element takes the value 0, 1, 2, or 3. A value of 0 indicates that the input of the filter coefficient at that position does not use nonlinearity. 1, 2, and 3 represent the indices of the corresponding nonlinear truncation values, used to retrieve the truncation values from a preset table. This syntax element uses a fixed-length code encoding / decoding of 2 bins. Its value is 0 if this syntax element is not present in the bitstream.
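The role of `tf_clip_idx` can be sketched as a table lookup followed by a clamp. Everything beyond the lookup itself is an assumption: the text does not say what quantity is truncated or how, so the symmetric clamp below is modelled loosely on common non-linear loop filter designs, and the table values in the usage note are placeholders rather than the preset table.

```python
from typing import Sequence


def clip_filter_input(value: int, clip_idx: int, clip_table: Sequence[int]) -> int:
    """Apply non-linear truncation to one filter input.

    clip_idx 0 means no truncation for this coefficient position.
    clip_idx 1..3 selects a bound from clip_table (hypothetical contents);
    the symmetric clamp to [-bound, bound] is an assumption of this sketch.
    """
    if clip_idx == 0:
        return value
    bound = clip_table[clip_idx]
    return max(-bound, min(bound, value))
```

With a placeholder table such as `[0, 64, 16, 4]`, an input of 100 stays 100 for index 0 and becomes 16 for index 2.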
[0076] Step 3: Parse the coding tree block syntax elements, as shown in Table 3 below.
[0077] Table 3
[0078] The syntax element `tf_ctb_idc[CtbAddrX][CtbAddrY]` indicates whether the luma coding tree block at horizontal address CtbAddrX and vertical address CtbAddrY uses adaptive filtering based on the reference image. When `tf_reuse_flag` is 0, its value ranges from 0 to `tf_num_filters_signalled_minus1`. For example, if `tf_num_filters_signalled_minus1` is 1, then the value of `tf_ctb_filter_idx` can be 0 or 1. Similarly, if `tf_num_filters_signalled_minus1` is 2, then the value of `tf_ctb_filter_idx` can be 0, 1, or 2. When `tf_reuse_flag` is 1, the current luma coding tree block reuses a set of filters saved from a previously coded frame. The value range of `tf_ctb_filter_idx` depends on the number of historical filters reused.
[0079] Step 4: Reconstruct the filter coefficients, as described below.
[0080] If the current slice's tf_ctb_enabled is not equal to 0 and tf_reuse_flag is 0, the coefficients, shift values, and nonlinear constraint values of the adaptive filter based on the reference image need to be reconstructed.
[0081] The process of obtaining the filter coefficients tfCoeff of the current slice is as follows:
[0082] The process of obtaining the nonlinear cutoff value tfClip for the current slice is as follows:
[0083] - Construct a non-linear truncation value table based on the pixel depth inputBitdepth of the luminance component.
[0084] - Obtain the nonlinear cutoff value for each coefficient of each filter in the current slice:
[0085] The process of obtaining the shift value of the adaptive loop filter for the current slice is as follows:
[0086] shift = tf_shift_minus6 + 6
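A brief sketch of how this shift value might be used follows. The derivation `shift = tf_shift_minus6 + 6` comes from the text; the fixed-point normalization with a rounding offset is an assumption about how the shift is applied, and both function names are hypothetical.

```python
def filter_shift(tf_shift_minus6: int) -> int:
    """Derive the shift value; tf_shift_minus6 is a 2-bin value in 0..3."""
    if not 0 <= tf_shift_minus6 <= 3:
        raise ValueError("tf_shift_minus6 must be in the range 0..3")
    return tf_shift_minus6 + 6


def normalize(accumulated: int, shift: int) -> int:
    """Scale a weighted sum back to sample precision.

    Adding half of the divisor before the right shift rounds to nearest;
    this rounding rule is an assumption, not taken from the text.
    """
    return (accumulated + (1 << (shift - 1))) >> shift
```

The shift therefore ranges from 6 to 9, meaning coefficients carry between 6 and 9 fractional bits under this reading.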
[0087] If the current slice's tf_ctb_enabled is not equal to 0 and tf_reuse_flag is 1, the coefficient values, shift values, and nonlinear limit values of the selected historical filter need to be obtained from the FIFO buffer that stores the historical reference image-based adaptive filters.
[0088] The process of obtaining the filter coefficients tfCoeff of the current slice is as follows:
[0089] The process of obtaining the nonlinear cutoff value tfClip for the current slice is as follows:
[0090] - Construct a non-linear truncation value table based on the pixel depth inputBitdepth of the luminance component.
[0091] - Obtain the nonlinear cutoff value for each coefficient of each filter in the current slice:
[0092] The process of obtaining the shift value of the adaptive loop filter for the current slice is as follows:
[0093] shift=tfParamPool[poolIdx][tf_reuse_idx].shift
[0094] Step 5: Perform TF filtering on the luminance coding tree block, as described below.
[0095] This step uses the filter shape 0 used for bidirectional filtering in Figure 6 and the filter shape 0 used for unidirectional filtering in Figure 7 as examples to describe the filtering process under 6 filtering modes. If filter shape 1 is selected in unidirectional filtering mode and bidirectional filtering mode, the corresponding filter input position should be modified according to the input in Figure 8 and Figure 9.
[0096] If the tf_ctb_enabled flag of the current luminance coding tree block is non-zero, then TF filtering is required. This step first requires obtaining the following variable values:
[0097] - The reconstructed luminance image array rec, after luminance adaptive loop filtering.
[0098] - The mode value tf_filter_mode.
[0099] - The luminance coordinates (xCtb, yCtb) of the current luminance coding tree block in the current image.
[0100] - The width (tfWidth) and height (tfHeight) of the luminance coding tree block.
[0101] - The coefficients tfCoeff[][] of the time-domain adaptive loop filter.
[0102] - The cutoff values tfClip[][] of the time-domain adaptive loop filter.
[0103] - The shift value of the time-domain adaptive loop filter.
[0104] The first step is to obtain the reconstructed image and MV:
[0105] The input image of the filter is determined based on tf_filter_mode.
[0106] If tf_filter_mode is 0, then the nearest reconstructed image rec0 is used as input;
[0107] If tf_filter_mode is 3, then check the motion vector buffer corresponding to the current position (x,y), and check whether the motion vector buffer contains Mv0 pointing to the reference image in the reference image list 0. If it contains it, then use the reference image rec0 pointed to by Mv0 as input; otherwise, skip the filtering at the current position.
[0108] If tf_filter_mode is 1, then the backward nearest reconstructed image rec1 is used in RA mode, and the forward second nearest reconstructed image rec1 is used as input in LD mode.
[0109] If tf_filter_mode is 4, then check the motion vector buffer corresponding to the current position (x,y), and check whether the motion vector buffer contains Mv1 pointing to the reference image in reference image list 1. If it contains it, then use the reference image rec1 pointed to by Mv1 as input; otherwise, skip the filtering at the current position.
[0110] If tf_filter_mode is 2, then in RA mode, the forward nearest reconstructed image rec0 and the backward nearest reconstructed image rec1 are used as input, and in LD mode, the forward nearest reconstructed image rec0 and the second nearest reconstructed image rec1 are used as input.
[0111] If tf_filter_mode is 5, then check the motion vector buffer corresponding to the current position (x,y). Check if the motion vector buffer contains Mv0 pointing to the reference image in reference image list 0 and Mv1 pointing to the reference image in reference image list 1. If it contains them, then use the reference image rec0 pointed to by Mv0 and the reference image rec1 pointed to by Mv1 as input. Otherwise, skip the filtering at the current position.
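As a non-limiting illustration, the six input-selection rules above can be summarized in a small lookup table. The following Python sketch is an assumption-laden restatement, not the normative process: the names TfInput, TF_MODE_TABLE, and select_tf_inputs are hypothetical and do not appear in this application, and the RA/LD distinction for modes 0 to 2 is left to the caller that supplies rec0 and rec1.

```python
from typing import NamedTuple, Optional


class TfInput(NamedTuple):
    use_rec0: bool   # the filter reads samples from rec0
    use_rec1: bool   # the filter reads samples from rec1
    needs_mv: bool   # rec0/rec1 are located through Mv0/Mv1


# Modes 0-2 use the temporally nearest reconstructed images;
# modes 3-5 use the reference images pointed to by Mv0 and/or Mv1.
TF_MODE_TABLE = {
    0: TfInput(True, False, False),
    1: TfInput(False, True, False),
    2: TfInput(True, True, False),
    3: TfInput(True, False, True),
    4: TfInput(False, True, True),
    5: TfInput(True, True, True),
}


def select_tf_inputs(tf_filter_mode: int, has_mv0: bool, has_mv1: bool) -> Optional[TfInput]:
    """Return the input configuration, or None when filtering is skipped
    at the current position because a required motion vector is missing."""
    cfg = TF_MODE_TABLE[tf_filter_mode]
    if cfg.needs_mv:
        if cfg.use_rec0 and not has_mv0:
            return None
        if cfg.use_rec1 and not has_mv1:
            return None
    return cfg
```

In this sketch, a return value of None corresponds to "skip the filtering at the current position" in modes 3, 4, and 5.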
[0112] The second step is to obtain the position offset:
[0113] If Mv0 and Mv1 are found in the first step, then, because an Mv used in inter-frame prediction has sub-pixel precision, the position offset on the reconstructed image pointed to by the Mv is obtained here by rounding to integer-pixel precision.
[0114] The horizontal integer pixel offset of Mv0 is:
[0115] Offset0X=Mv0.Hor<0? -((abs(Mv0.Hor)+8)>>4):((abs(Mv0.Hor)+8)>>4)
[0116] The vertical integer pixel position offset of Mv0 is:
[0117] Offset0Y=Mv0.Ver<0? -((abs(Mv0.Ver)+8)>>4):((abs(Mv0.Ver)+8)>>4)
[0118] The horizontal integer pixel offset of Mv1 is:
[0119] Offset1X=Mv1.Hor<0? -((abs(Mv1.Hor)+8)>>4):((abs(Mv1.Hor)+8)>>4)
[0120] The vertical integer pixel position offset of Mv1 is:
[0121] Offset1Y=Mv1.Ver<0? -((abs(Mv1.Ver)+8)>>4):((abs(Mv1.Ver)+8)>>4)
[0122] If tf_filter_mode is 0, 1, or 2 in the first step, the offset values do not need to be derived from an Mv, so Offset0X, Offset0Y, Offset1X, and Offset1Y are set to 0.
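By way of example only, the rounding expressions above can be written as a single helper. The Python sketch below assumes, based on the right shift by 4, that each Mv component is stored in 1/16-sample units; the function names mv_to_int_offset and position_offsets are hypothetical.

```python
def mv_to_int_offset(mv_component: int) -> int:
    """Round one Mv component to the nearest integer sample, symmetrically
    about zero, mirroring: Mv<0 ? -((abs(Mv)+8)>>4) : ((abs(Mv)+8)>>4)."""
    magnitude = (abs(mv_component) + 8) >> 4
    return -magnitude if mv_component < 0 else magnitude


def position_offsets(tf_filter_mode, mv0=None, mv1=None):
    """Return (Offset0X, Offset0Y, Offset1X, Offset1Y).

    mv0 and mv1 are (Hor, Ver) tuples, or None when not available.
    For modes 0, 1, and 2 no Mv is used and all offsets are 0.
    """
    if tf_filter_mode in (0, 1, 2):
        return (0, 0, 0, 0)
    off0 = tuple(mv_to_int_offset(c) for c in mv0) if mv0 is not None else (0, 0)
    off1 = tuple(mv_to_int_offset(c) for c in mv1) if mv1 is not None else (0, 0)
    return off0 + off1
```

Taking the absolute value before shifting makes the rounding symmetric for positive and negative components, which a plain arithmetic shift of a negative value would not.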
[0123] The third step is to fetch the reconstructed samples of the input image(s) at the positions given by the obtained position offsets and then filter them using the filter coefficients.
[0124] The filtering calculation for each position in the luminance-coded block is as follows:
[0125] In the above calculation and filtering process, the BitDepth variable represents the bit depth of the luminance component, and x and y represent the horizontal and vertical coordinates of the reconstructed image array. x0, y0, x1, y1 are the coordinates of the center positions of rec0 and rec1 after offset, respectively. When using the reconstructed values in the rec0 and rec1 arrays, the horizontal coordinate should be limited to between 0 and the image width picWidth-1, and the vertical coordinate should be limited to between 0 and the image height picHeight-1. K(a,b)=min(b,max(-b,a))
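As a minimal sketch of the two auxiliary operations described above, the clipping function K and the coordinate limitation can be written as follows. Only these two operations are shown; the filtering sum itself is not reproduced here. The names clamp_coord and fetch_sample, and the row-major layout rec[y][x], are assumptions made for illustration.

```python
def K(a: int, b: int) -> int:
    """Clipping function K(a, b) = min(b, max(-b, a))."""
    return min(b, max(-b, a))


def clamp_coord(x: int, y: int, pic_width: int, pic_height: int):
    """Limit x to [0, picWidth-1] and y to [0, picHeight-1]."""
    cx = min(max(x, 0), pic_width - 1)
    cy = min(max(y, 0), pic_height - 1)
    return cx, cy


def fetch_sample(rec, x: int, y: int) -> int:
    """Read rec at (x, y) after clamping; rec is a list of rows (rec[y][x])."""
    pic_height = len(rec)
    pic_width = len(rec[0])
    cx, cy = clamp_coord(x, y, pic_width, pic_height)
    return rec[cy][cx]
```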
[0126] When tf_filter_mode is 5, compared to tf_filter_mode 2, filtering for positions without corresponding motion information needs to be skipped.
[0127] When tf_filter_mode is 0:
[0128] When tf_filter_mode is 3, compared with tf_filter_mode 0, filtering for positions without corresponding motion information needs to be skipped.
[0129] When tf_filter_mode is 1:
[0130] When tf_filter_mode is 4, compared with tf_filter_mode 1, filtering for positions without corresponding motion information needs to be skipped.
[0131] During the filtering process, the above scheme uses the same image edge padding method as ALF to obtain the TF input values outside the required image range for the image boundaries.
[0132] In some embodiments, ALF filtering performs edge padding on a block-by-block basis. Filtering within a block does not use the reconstructed values of other blocks. In this case, TF can also perform edge padding on a block-by-block basis.
[0133] In adaptive filtering techniques based on reference images, the adaptive filter coefficients need to be transmitted through the bitstream and can be reused. However, during the implementation and analysis of the above-mentioned scheme, the inventors of this application discovered that, under the relevant design, the filter coefficients are transmitted at the slice level even though the same set of coefficients is always used for the same frame image. Transmitting coefficients at the slice level is therefore unreasonable.
[0134] In view of this, in the embodiments of this application, the relevant adaptive parameters of the adaptive filtering (TF) based on the reference image are encoded and decoded as a new APS type in the APS.
[0135] One embodiment of this application provides an encoding method, namely, determining the filtering correlation parameters of the current block, wherein the filtering correlation parameters are filtering correlation parameters of a filter based on a reference position; wherein the reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-position of the current position on the reference image; determining the filtering coefficients of the filter of the current block according to the filtering correlation parameters of the current block; and filtering the reconstructed value of the current position according to the reference position of the current position and the filtering coefficients of the filter of the current block.
[0136] This application embodiment also provides a decoding method, namely, obtaining the filter correlation parameters of the current block from the APS, wherein the filter correlation parameters are filter correlation parameters of a filter based on a reference position; wherein the reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-position of the current position on the reference image; determining the filter coefficients of the filter of the current block according to the filter correlation parameters of the current block; and filtering the reconstructed value of the current position according to the reference position of the current position and the filter coefficients of the filter of the current block.
[0137] It is understood that in the embodiments of this application, the filter-related parameters of the filter based on the reference position are placed in the APS rather than at the image level or slice level. This is because, regardless of whether these filter-related parameters are encoded or decoded at the image level or slice level, a corresponding cache is needed to handle the storage of the parsed coefficients. However, the inventors of this application have found that the updates and parsing of these filter-related parameters are not actually that frequent and do not need to be updated at the image level or slice level. Therefore, placing these filter-related parameters in the APS reduces the cache size at the slice level and image level, thus saving cache resources.
[0138] The embodiments of this application will now be described in detail with reference to the accompanying drawings.
[0139] Figure 10 is a schematic diagram of a video encoding and decoding network architecture provided in an embodiment of this application. As shown in Figure 10, the network architecture includes one or more electronic devices 31 to 3N and a communication network 01, wherein the electronic devices 31 to 3N can perform video interaction through the communication network 01. The electronic devices can be various types of devices with video encoding and decoding capabilities, such as mobile phones, tablets, personal computers, personal digital assistants, navigators, digital phones, video phones, televisions, sensing devices, servers, etc., which is not limited in the embodiments of this application.
[0140] This application provides a network architecture for a video encoding / decoding system that includes decoding and encoding methods. The decoder or encoder in this application can be the aforementioned electronic device, or the aforementioned electronic device may include a decoder or encoder. In other words, the electronic device in this application has video encoding / decoding capabilities and generally includes a video encoder (i.e., encoder) and a video decoder (i.e., decoder).
[0141] Figure 11 is a schematic block diagram of an encoder system according to an embodiment of this application. As shown in Figure 11, the encoder 100 may include: a segmentation unit 101, a prediction unit 102, a first adder 107, a transform unit 108, a quantization unit 109, an inverse quantization unit 110, an inverse transform unit 111, a second adder 112, a filtering unit 113, a Decoded Picture Buffer (DPB) unit 114, and an entropy coding unit 115. Here, the input of the encoder 100 can be a video composed of a series of images or a single static image, and the output of the encoder 100 can be a code stream (also called a "bitstream") representing a compressed version of the input video.
[0142] The segmentation unit 101 segments the images in the input video into one or more Coding Tree Units (CTUs). The segmentation unit 101 divides the image into multiple tiles, and can further divide a tile into one or more bricks. Here, a tile or a brick can include one or more complete and / or partial CTUs. Additionally, the segmentation unit 101 can form one or more slices, where a slice can include one or more tiles arranged in raster order in the image, or one or more tiles covering a rectangular area of the image. The segmentation unit 101 can also form one or more sub-images, where a sub-image can include one or more slices, tiles, or bricks.
[0143] During the encoding process of encoder 100, segmentation unit 101 transmits the CTU to prediction unit 102. Typically, prediction unit 102 may consist of block segmentation unit 103, motion estimation (ME) unit 104, motion compensation (MC) unit 105, and intra-prediction unit 106. Specifically, block segmentation unit 103 iteratively uses quadtree segmentation, binary tree segmentation, and ternary tree segmentation to further divide the input CTU into smaller coding units (CUs). Prediction unit 102 can use ME unit 104 and MC unit 105 to obtain inter-frame prediction blocks of the CUs. Intra-prediction unit 106 can use various intra-prediction modes, including MIP modes, to obtain intra-frame prediction blocks of the CUs. In the example, rate-distortion optimized motion estimation can be invoked by ME unit 104 and MC unit 105 to obtain inter-frame prediction blocks, and rate-distortion optimized mode determination can be invoked by intra-prediction unit 106 to obtain intra-frame prediction blocks.
[0144] Prediction unit 102 outputs the predicted block of the CU. First adder 107 calculates the difference between the CU in the output of segmentation unit 101 and the predicted block of the CU, i.e., the residual CU. Transform unit 108 reads the residual CU and performs one or more transform operations on the residual CU to obtain coefficients. Quantization unit 109 quantizes the coefficients and outputs quantization coefficients (i.e., levels). Inverse quantization unit 110 performs scaling operations on the quantization coefficients to output reconstructed coefficients. Inverse transform unit 111 performs one or more inverse transforms corresponding to the transforms in transform unit 108 and outputs the reconstructed residual. Second adder 112 calculates the reconstructed CU by adding the reconstructed residual to the predicted block of the CU from prediction unit 102. Second adder 112 also sends its output to prediction unit 102 as an intra-frame prediction reference. After all CUs in the image or sub-image are reconstructed, filtering unit 113 performs loop filtering on the reconstructed image or sub-image. Here, the filtering unit 113 includes one or more filters, such as a deblocking filter, a sample adaptive offset (SAO) filter, an adaptive loop filter (ALF), a luma mapping with chroma scaling (LMCS) filter, and a neural network-based filter. Alternatively, when the filtering unit 113 determines that the CU is not used as a reference for encoding other CUs, the filtering unit 113 performs loop filtering on one or more target samples in the CU.
[0145] The output of filtering unit 113 is a decoded image or sub-image, which is buffered in DPB unit 114. DPB unit 114 outputs the decoded image or sub-image according to timing and control information. Here, the image stored in DPB unit 114 can also be used as a reference for prediction unit 102 to perform inter-frame prediction or intra-frame prediction. Finally, entropy coding unit 115 converts the parameters (such as control parameters and supplementary information) necessary for decoding the image from encoder 100 into binary form, and writes such binary form into the bitstream according to the syntax structure of each data unit, which is the final output bitstream of encoder 100.
[0146] Furthermore, encoder 100 may be a computing device having a first processor and a first memory that stores a computer program. When the first processor reads and runs the computer program, encoder 100 reads the input video and generates a corresponding bitstream. Alternatively, encoder 100 may also be a computing device having one or more chips. The units implemented as integrated circuits on these chips have connection and data exchange functions similar to the corresponding units in Figure 11.
[0147] Figure 12 is a schematic block diagram of a decoder system according to an embodiment of this application. As shown in Figure 12, the decoder 180 may include: a decoding unit 201, a prediction unit 202, an inverse quantization unit 205, an inverse transform unit 206, an adder 187, a filtering unit 208, and a decoded image buffer unit 209. Here, the input of the decoder 180 is a bitstream representing a compressed version of a video or a still image, and the output of the decoder 180 may be a decoded video composed of a series of images or a decoded still image.
[0148] The input bitstream to decoder 180 can be the bitstream generated by encoder 100. Decoding unit 201 parses the input bitstream and obtains the values of syntax elements from it. Decoding unit 201 converts the binary representation of the syntax elements into digital values and sends these digital values to units in decoder 180 to obtain one or more decoded images. Decoding unit 201 can also parse one or more syntax elements from the input bitstream to display the decoded images.
[0149] During the decoding process of decoder 180, decoding unit 201 sends the value of the syntax element and one or more variables set or determined according to the value of the syntax element for obtaining one or more decoded images to the unit in decoder 180.
[0150] Prediction unit 202 determines the prediction block of the current decoded block (e.g., CU). Here, prediction unit 202 may include motion compensation unit 203 and intra-prediction unit 204. Specifically, when an inter-frame decoding mode is indicated for decoding the current decoded block, prediction unit 202 transmits relevant parameters from decoding unit 201 to motion compensation unit 203 to obtain inter-frame prediction blocks; when an intra-frame prediction mode (including MIP mode indicated by MIP mode index value) is indicated for decoding the current decoded block, prediction unit 202 transmits relevant parameters from decoding unit 201 to intra-prediction unit 204 to obtain intra-frame prediction blocks.
[0151] The inverse quantization unit 205 has the same function as the inverse quantization unit 110 in the encoder 100. The inverse quantization unit 205 performs a scaling operation on the quantization coefficients (i.e., levels) from the decoding unit 201 to obtain reconstruction coefficients. The inverse transform unit 206 has the same function as the inverse transform unit 111 in the encoder 100. The inverse transform unit 206 performs one or more inverse transform operations (i.e., the inverse of the one or more transform operations performed by the transform unit 108 in the encoder 100) to obtain reconstruction residuals. The adder 187 performs an addition operation on its inputs (the prediction block from the prediction unit 202 and the reconstruction residuals from the inverse transform unit 206) to obtain the reconstruction block of the current decoded block. The reconstruction block is also sent to the prediction unit 202 as a reference for other blocks decoded in intra-frame prediction mode.
[0152] After all CUs in an image or sub-image are reconstructed, filtering unit 208 performs loop filtering on the reconstructed image or sub-image. Filtering unit 208 includes one or more filters, such as a deblocking filter, a sample adaptive offset (SAO) filter, an adaptive loop filter (ALF), a luma mapping with chroma scaling (LMCS) filter, and a neural network-based filter. Alternatively, when filtering unit 208 determines that a reconstructed block is not used as a reference for decoding other blocks, filtering unit 208 performs loop filtering on one or more target samples in the reconstructed block. Here, the output of filtering unit 208 is a decoded image or sub-image, which is buffered in DPB unit 209. DPB unit 209 outputs the decoded image or sub-image based on timing and control information. The image stored in DPB unit 209 can also be used as a reference for performing inter-frame prediction or intra-frame prediction by prediction unit 202.
[0153] Furthermore, decoder 180 may be a computing device having a second processor and a second memory that stores a computer program. When the second processor reads and runs the computer program, decoder 180 reads the input bitstream and generates the corresponding decoded video. Alternatively, decoder 180 may also be a computing device having one or more chips. The units implemented as integrated circuits on these chips have connection and data exchange functions similar to the corresponding units in Figure 12.
[0154] It should also be noted that when the embodiments of this application are applied to the encoder 100, the "current block" specifically refers to the block to be encoded in the video image (which can also be simply referred to as the "encoded block"); when the embodiments of this application are applied to the decoder 180, the "current block" specifically refers to the block to be decoded in the video image (which can also be simply referred to as the "decoded block").
[0155] This application provides a decoding method that is applied to a decoder.
[0156] Figure 13 is a schematic diagram of the implementation flow of the decoding method provided in the embodiment of this application. As shown in Figure 13, the method may include the following steps 1301 to 1303:
[0157] Step 1301: Obtain the filter correlation parameters of the current block from the APS. The filter correlation parameters are filter correlation parameters of the filter based on the reference position. The reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-position of the current position on the reference image.
[0158] Step 1302: Determine the filter coefficients of the current block based on the filter-related parameters of the current block;
[0159] Step 1303: Filter the reconstructed value of the current position based on the reference position of the current position and the filter coefficients of the filter of the current block.
[0160] It is understood that in the embodiments of this application, the filter-related parameters of the filter based on the reference position are placed in the APS rather than at the image level or slice level. This is because, regardless of whether these filter-related parameters are encoded or decoded at the image level or slice level, a corresponding cache is needed to handle the storage of the parsed coefficients. However, the inventors of this application have found that the updates and parsing of these filter-related parameters are not actually that frequent and do not need to be updated at the image level or slice level. Therefore, placing these filter-related parameters in the APS reduces the cache size at the slice level and image level, thus saving cache resources.
[0161] The following sections will describe further optional implementation methods for each of the above steps, as well as related terms.
[0162] Step 1301: Obtain the filter correlation parameters of the current block from the APS. The filter correlation parameters are the filter correlation parameters of the filter based on the reference position. The reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-position of the current position on the reference image.
[0163] It should be noted that the encoding and decoding method provided in this application describes a reference position-based filtering technique, which can also be referred to as TF technique. In this TF technique, the reconstructed value of the reference position pointed to by the current position Mv or Bv is used as the input information of the filter, or the reconstructed value of the current position at the same position in the reference image (such as the reference image that is closest in the forward temporal domain or the reference image that is closest in the backward temporal domain) is used as the input information of the filter.
[0164] In some embodiments, as shown in Figure 14, step 1301, obtaining the filter correlation parameters of the current block based on the reference position from the APS, may further include the following steps 1401 and 1402:
[0165] Step 1401: parse the first syntax element and the second syntax element in the bitstream; wherein, the first syntax element is used to indicate the index of the APS used for filtering the current block in the first APS candidate list; the second syntax element is used to indicate the index of the filtering-related parameters of the current block in the APS indicated by the first syntax element.
[0166] For example, suppose the TF_APSs parsed in the slice header are the 0th and the 3rd; based on this, when determining whether the current block uses the 0th or the 3rd TF_APS, the value of the first syntax element here is not 0 or 3, but 0 or 1. This is because a first APS candidate list is pre-constructed at the slice level, and 0 and 1 are the index values of the TF_APSs in this APS candidate list. In one possible implementation, the first syntax element can be tf_setId, where tf_setId is a block-level syntax element that can be indicated in the coding tree unit coding_tree_unit().
[0167] It should be noted that TF_APS refers to a dataset of filter-related parameters for a reference-position-based filter. These parameters are used for reference-position-based filtering techniques. One or more TF_APS can be used for reference-position-based filtering techniques. In one possible implementation, the number of TF_APS available for the current sequence, image, or slice can be indicated in sequence-level, image-level, or slice-level syntax elements; different blocks may use different TF_APS.
[0168] A single TF_APS can include filter-related parameters for multiple filters. Here, "filter" refers to a reference-position-based filter. Which filter's filter-related parameters from the TF_APS should be used when performing reference-position-based filtering on the current block is determined by the value of the second syntax element. For example, assuming the TF_APS includes filter-related parameters for three filters, the value of the second syntax element can be 0, 1, or 2. In one possible implementation, the second syntax element can be tf_ctb_filter_idx; where tf_ctb_filter_idx is a block-level syntax element and can be indicated in the coding tree unit (coding_tree_unit()).
[0169] Step 1402: Based on the values of the first syntax element and the second syntax element, obtain the filtering-related parameters of the current block from the first APS candidate list.
[0170] It is understandable that the filtering-related parameters of the corresponding filter can be obtained from the first APS candidate list based on the values of the first and second syntax elements.
[0171] Further, in some embodiments, for step 1401, parsing the first and second syntax elements in the bitstream may include: parsing a third syntax element in the bitstream; wherein the third syntax element is used to indicate the number of APSs used in the current slice or the current image; the APS includes filtering correlation parameters of one or more filters based on reference positions, where APS refers to TF_APS; when the value of the third syntax element is greater than 0, the first and second syntax elements in the bitstream are parsed.
[0172] In this embodiment, the third syntax element is a syntax element that indicates the number of APSs used in the current slice or current image minus 1. Taking a maximum of 8 TF_APSs as an example, the value range of this third syntax element should be 0 to 7. In one possible implementation, the third syntax element can be sh_num_tf_aps_ids_minus1 or ph_num_tf_aps_ids_minus1. Here, sh_num_tf_aps_ids_minus1 is a slice-level syntax element, which can be indicated in the slice_header() syntax structure; ph_num_tf_aps_ids_minus1 is an image-level syntax element, which can be indicated in the picture_header_structure() syntax structure.
[0173] It should be noted that, in the embodiments of this application, the parsing of the first and second syntax elements in the bitstream can be unconditional or conditional. An unconditional approach means that it is not necessary to determine whether the value of the third syntax element is greater than 0; instead, the syntax elements carried in the bitstream are parsed sequentially according to the receiving order. A conditional approach means that the first and second syntax elements in the bitstream are parsed only when the value of the third syntax element is greater than 0, that is, when the number of TF_APS used in the current slice or current image is greater than 1.
[0174] Furthermore, in some embodiments, parsing the first and second syntax elements in the bitstream when the value of the third syntax element is greater than 0 includes: parsing the fourth syntax element in the bitstream; and parsing the first and second syntax elements in the bitstream when the value of the fourth syntax element indicates that the current block uses a reference position-based filtering technique and the value of the third syntax element is greater than 0.
[0175] In one possible implementation, the fourth syntax element could be tf_ctb_enabled; where tf_ctb_enabled is a block-level syntax element that can be indicated in the coding tree unit coding_tree_unit().
[0176] Furthermore, in some embodiments, the parsing of the fourth syntax element in the bitstream includes: parsing the fifth syntax element in the bitstream; and parsing the fourth syntax element in the bitstream when the value of the fifth syntax element indicates that the current slice or the current image uses a reference position-based filtering technique.
[0177] It is understandable that the fifth syntax element indicates whether the current slice or the current image uses a reference position-based filtering technique. However, some coding tree units in the current slice or the current image may use a reference position-based filtering technique, while others may not. Therefore, if the fifth syntax element in the bitstream is parsed to determine whether the current slice or the current image uses a reference position-based filtering technique, the fourth syntax element in the bitstream needs to be further parsed to determine whether the current block uses a reference position-based filtering technique.
[0178] In one possible implementation, the fifth syntax element can be either sh_tf_enabled_flag or ph_tf_enabled_flag, where sh_tf_enabled_flag is a slice-level syntax element that can be indicated in slice_header(), and ph_tf_enabled_flag is an image-level syntax element that can be indicated in picture_header_structure().
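For illustration only, the nested parsing conditions described above (fifth syntax element, then fourth, then first and second when the third is greater than 0) may be sketched in Python as follows. The callable read is a hypothetical stand-in for the entropy decoder and is not part of this application; the default of 0 used for any element that is not parsed is an assumption of this sketch.

```python
def parse_ctu_tf_syntax(read, tf_enabled_flag: int, num_tf_aps_ids_minus1: int) -> dict:
    """Parse the block-level TF syntax elements of one coding tree unit.

    read                  -- hypothetical callable: read(name) returns the decoded value
    tf_enabled_flag       -- fifth syntax element (sh_tf_enabled_flag or ph_tf_enabled_flag)
    num_tf_aps_ids_minus1 -- third syntax element
    """
    # Assumed defaults for elements that are absent from the bitstream.
    syntax = {"tf_ctb_enabled": 0, "tf_setId": 0, "tf_ctb_filter_idx": 0}

    # Fifth element: TF is off for the whole slice or image, nothing to parse.
    if not tf_enabled_flag:
        return syntax

    # Fourth element: does this coding tree unit use TF filtering?
    syntax["tf_ctb_enabled"] = read("tf_ctb_enabled")

    # First and second elements: parsed only when the block uses TF and
    # the third element is greater than 0 (the conditional approach).
    if syntax["tf_ctb_enabled"] and num_tf_aps_ids_minus1 > 0:
        syntax["tf_setId"] = read("tf_setId")
        syntax["tf_ctb_filter_idx"] = read("tf_ctb_filter_idx")
    return syntax
```

This sketch follows the conditional approach; under the unconditional approach the elements would simply be read in the order in which they are received.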
[0179] It is understandable that, since the filtering-related parameters of a filter are moved to the APS, the value range of the filter index parsed for each coding tree unit should depend on the APS in which the filter resides (different APSs may contain different numbers of filters). In one possible implementation, an example of the coding tree unit-level parsing that determines whether the luminance component of the current coding tree unit uses a reference position-based filtering technique (i.e., TF filtering) and, if so, which filter to use, is shown in Table 4 below.
[0180] Table 4
[0181] Similar to the related technologies mentioned earlier, the `tf_ctb_enabled` flag (which is also an example of the fourth syntax element) is used to determine whether the current coding tree unit uses TF filtering (i.e., reference position-based filtering). A value of 1 indicates that it is used, otherwise it is not. If this syntax element does not exist in the bitstream, the flag is 0.
[0182] Unlike the related technologies mentioned earlier,
[0183] The `tf_setId` syntax element (an example of the first syntax element) is used to determine which TF_APS the selected filter group should be retrieved from when the current coding tree unit uses TF filtering. The index of the selected TF_APS should be equal to `sh_tf_aps_id[tf_setId]`, and the selected filter group is found through the TF_APS index. The value of `tf_setId` should range from 0 to `sh_num_tf_aps_ids_minus1`. Its value defaults to 0 when the bitstream does not contain this syntax element.
[0184] `tf_ctb_filter_idx` identifies which filter from a selected set of TF filters (i.e., reference position-based filters) will be used for filtering the current coding tree luma block. Specifically, it needs to further determine the selected filter coefficients, filter adaptive accuracy, filter shape, filter nonlinear cutoff value, etc., from the selected TF_APS. If there are N filters in the selected TF_APS, the value of this syntax element should be in the range of 0 to N-1.
[0185] Specifically, the selected filter coefficients should first be found in the TF_APS type APS dataset with index sh_tf_aps_id[tf_setId], and then the values in this dataset should be found based on the value of tf_ctb_filter_idx.
[0186] 1. The absolute value array of filter coefficients: tf_coeff_abs[tf_ctb_filter_idx]
[0187] 2. The filter coefficient sign array tf_coeff_sign[tf_ctb_filter_idx],
[0188] 3. Filter cutoff value index array tf_clip_idx[tf_ctb_filter_idx].
[0189] This information is used to construct the filter coefficients and cutoff values, which are then applied to the filtering of the current luminance block in the coding tree.
[0190] For example, each selected filter coefficient tfCoeff[i] is:
[0191] tfCoeff[i]=tf_coeff_sign[tf_ctb_filter_idx][i]==1?
[0192] -tf_coeff_abs[tf_ctb_filter_idx][i]:tf_coeff_abs[tf_ctb_filter_idx][i]
[0193] Where i is the index of each filtering position of the filter. For example, in a filter with 13 coefficients, the range of i is 0 to 12.
[0194] Each selected filter cutoff value tfClip[i] is:
[0195] tfClip[i]=1<<(7-tf_clip_idx[tf_ctb_filter_idx][i]*2+(inputBitdepth-8))
[0196] Where i is the index of each filtering position of the filter. For example, in a filter with 13 coefficients, the range of i is 0 to 12.
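As a hedged sketch, the construction of tfCoeff[] and tfClip[] from the parsed arrays may be expressed in Python as below. The sketch reads the coefficient expression as a sign and magnitude selection (a sign value of 1 yields a negative coefficient), which is an interpretation of the formula above; the function name build_tf_filter is hypothetical, and the inputs are assumed to be the rows already selected by tf_ctb_filter_idx.

```python
def build_tf_filter(tf_coeff_abs, tf_coeff_sign, tf_clip_idx, input_bitdepth: int):
    """Build (tfCoeff, tfClip) for one selected filter.

    tf_coeff_abs   -- coefficient magnitudes, one per filtering position
    tf_coeff_sign  -- sign flags, 1 meaning negative (assumed interpretation)
    tf_clip_idx    -- cutoff value indices, one per filtering position
    input_bitdepth -- bit depth of the input samples (inputBitdepth)
    """
    tf_coeff = [-a if s == 1 else a for a, s in zip(tf_coeff_abs, tf_coeff_sign)]
    # tfClip[i] = 1 << (7 - tf_clip_idx[i]*2 + (inputBitdepth - 8));
    # assumes the index range keeps the shift amount non-negative.
    tf_clip = [1 << (7 - idx * 2 + (input_bitdepth - 8)) for idx in tf_clip_idx]
    return tf_coeff, tf_clip
```

For example, with a 10-bit input, cutoff indices 0, 1, and 3 map to cutoff values 512, 128, and 8, so a larger index gives a tighter clipping range.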
[0197] In some embodiments, the method for constructing the first APS candidate list further includes: parsing a seventh syntax element in the bitstream based on a third syntax element; wherein the seventh syntax element is used to indicate the index of the APS used in the current slice or the current image; obtaining the corresponding APS based on the value of at least one parsed seventh syntax element; and constructing the first APS candidate list based on the value of at least one parsed seventh syntax element and the corresponding APS. Here, APS refers to TF_APS, which includes filtering-related parameters used for reference-position-based filtering techniques.
[0198] For example, the corresponding APSs are arranged sequentially in the first APS candidate list according to the values of the seventh syntax element in ascending (or descending) order. For instance, suppose the TF_APS parsed in the slice header (or image header) is the 0th and 3rd; based on this, when determining whether the current block uses the 0th or 3rd TF_APS, the value of the first syntax element here is not 0 or 3, but 0 or 1. This is because a first APS candidate list is pre-constructed at the slice level (or image level), and 0 and 1 are the index values of the TF_APS in this APS candidate list.
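The index remapping in this example can be sketched as follows. This is only an illustration: `build_first_aps_candidate_list` and the dictionary used as the stored-APS cache are hypothetical names, and the strings stand in for parsed TF_APS contents.

```python
def build_first_aps_candidate_list(tf_aps_ids, aps_store, descending=False):
    """Arrange the signalled TF_APS in ascending (or descending) id order.

    tf_aps_ids: APS indices parsed from the slice header or image header.
    aps_store:  mapping from APS index to the stored TF_APS (hypothetical).
    """
    ordered_ids = sorted(tf_aps_ids, reverse=descending)
    return [aps_store[aps_id] for aps_id in ordered_ids]


# Stored TF_APS 0, 3 and 5; the slice header signals the 0th and the 3rd.
aps_store = {0: "TF_APS_0", 3: "TF_APS_3", 5: "TF_APS_5"}
candidates = build_first_aps_candidate_list([3, 0], aps_store)

# The block-level syntax element indexes the candidate list, not the APS id:
# a value of 1 selects the TF_APS whose APS index is 3.
selected = candidates[1]
```

Signalling a position in the candidate list rather than the APS index itself keeps the block-level value range equal to the number of APSs actually in use.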
[0199] As mentioned earlier, the third syntax element is used to indicate the number of APSs used in the current slice or the current image; the APS includes filter-related parameters of one or more filters based on the reference position, where APS refers to TF_APS; for example, the third syntax element is sh_num_tf_aps_ids_minus1 or ph_num_tf_aps_ids_minus1.
[0200] In one possible implementation, the seventh syntax element can be either sh_tf_aps_id or ph_tf_aps_id; where sh_tf_aps_id is a slice-level syntax element, which can be indicated in slice_header(); and ph_tf_aps_id is an image-level syntax element, which can be indicated in picture_header_structure().
[0201] The process of parsing the seventh syntax element in the code stream based on the third syntax element can be implemented as follows:
[0202] Alternatively, the step of parsing the seventh syntax element in the code stream based on the third syntax element can also be implemented as follows:
[0203] As mentioned earlier, parsing the third syntax element in the bitstream, and parsing the seventh syntax element in the bitstream based on the third syntax element, may further include, in some embodiments, parsing the eighth syntax element in the bitstream; wherein the eighth syntax element is used to indicate the location of the control identifier of the filtering technique based on the reference position; when the value of the eighth syntax element indicates that the control identifier exists in the image header, the third syntax element carried in the image header in the bitstream is parsed, and the seventh syntax element carried in the image header in the bitstream is parsed based on the third syntax element.
[0204] In the embodiments of this application, the control identifier of the reference position-based filtering technique includes, but is not limited to, a third syntax element and a seventh syntax element; wherein, the third syntax element is used to indicate the number of APSs used in the current slice or the current image; the APS includes filtering-related parameters of one or more reference position-based filters, where APS refers to TF_APS; the seventh syntax element is used to indicate the index of the APS used in the current slice or the current image.
[0205] It should be understood that, when the value of the eighth syntax element indicates that the control identifier exists in the image header, the third syntax element carried in the image header of the bitstream is parsed, and based on the third syntax element, the seventh syntax element carried in the image header of the bitstream is parsed. The third and seventh syntax elements, carried in the image header, are image-level syntax elements. For example, the third syntax element could be the image-level `ph_num_tf_aps_ids_minus1`, which can be found in the image header `picture_header_structure()`. The seventh syntax element could be the image-level `ph_tf_aps_id`, which can also be found in the image header `picture_header_structure()`.
[0206] Alternatively, in other embodiments, parsing the third syntax element in the bitstream and, based on the third syntax element, parsing the seventh syntax element in the bitstream may include: when the value of the eighth syntax element indicates that the control identifier exists in the slice header, parsing the third syntax element carried in the slice header of the bitstream, and parsing the seventh syntax element carried in the slice header of the bitstream based on the third syntax element.
[0207] Alternatively, in some other embodiments, parsing the third syntax element in the bitstream and, based on the third syntax element, parsing the seventh syntax element in the bitstream may include: if the eighth syntax element is not obtained by parsing the bitstream, parsing the third syntax element carried in the slice header of the bitstream, and parsing the seventh syntax element carried in the slice header of the bitstream based on the third syntax element.
[0208] It should be understood that when the value of the eighth syntax element indicates that the control identifier is present in the slice header, the third syntax element carried in the slice header of the bitstream is parsed, and based on the third syntax element, the seventh syntax element carried in the slice header of the bitstream is parsed. The third and seventh syntax elements, carried in the slice header, are slice-level syntax elements. For example, the third syntax element could be the slice-level `sh_num_tf_aps_ids_minus1`, which can be found in the slice header syntax structure `slice_header()`. The seventh syntax element could be the slice-level `sh_tf_aps_id`, which can also be found in `slice_header()`.
[0209] In this embodiment, the eighth syntax element is used to indicate the location of the control flag for the reference-position-based filtering technique. For example, a value of 1 for the eighth syntax element indicates that the control flag and other information for the reference-position-based filtering technique are in the image header; conversely, a value of 0 indicates that the control flag and other information for the reference-position-based filtering technique are in the slice header. When this syntax element is not present in the bitstream, its default value is 0. However, this application does not impose any limitations on the value of the eighth syntax element. In one possible implementation, the eighth syntax element can be either pps_tf_info_in_ph_flag or pps_alf_info_in_ph_flag.
[0210] It is understandable that improving the syntax elements of the APS requires corresponding changes to the syntax elements of the image header and slice header, mainly because the APS is a dataset independent of the image header and slice header. Therefore, if the current image or slice selects a reference position-based filtering technique (i.e., the TF technique or TF filtering technique), the bitstream must be parsed to determine which TF_APS contains the TF filtering-related parameters. In one possible implementation, a bitstream can carry eight TF_APS.
[0211] Unlike related technologies, since the TF filtering parameters are parsed into and stored in the APS, the TF-related parameters used for filtering the current image or slice must be obtained through the APS indices parsed from the image header or slice header. If the parsed APS index is 0, the relevant filtering parameters are obtained from the 0th APS of type TF_APS; if the parsed APS indices are 0 and 1, the filtering parameters obtained from the 0th and 1st APS of type TF_APS are used for filtering the current image or slice.
[0212] In one possible implementation, the TF identifier can be carried in either the image header or the slice header. `pps_alf_info_in_ph_flag` can be reused to control whether the TF identifier and the APS dataset index are in the image header or the slice header, or a separate, independent identifier such as `pps_tf_info_in_ph_flag` can be used. If a separate identifier is used to control whether the TF identifier exists in the image header or the slice header, one possible implementation is shown in Table 5 below.
[0213] Table 5
[0214] The `pps_tf_info_in_ph_flag` parameter indicates whether information such as the TF control flag exists in the image header. A value of 1 indicates that the TF control flag is present in the image header, while a value of 0 indicates it exists in the slice header. The default value is 0 when this syntax element is not present in the bitstream.
[0215] The control flags and other information for encoding and decoding TF in the image header and slice header include enable flags, APS indication information, and filtering modes. The tables below show examples of the control flags and other information for encoding and decoding TF in the image header and slice header.
[0216] When pps_tf_info_in_ph_flag is 0, Table 6 below shows an example of the control flags for encoding and decoding TF in the slice header.
[0217] Table 6
[0218] Where:
[0219] The `sh_tf_enabled_flag` syntax element is a flag in the slice header indicating whether reference-based adaptive filtering is enabled. If this syntax element is not present in the bitstream, its value is inferred to be 0. A value of 1 indicates that reference-based adaptive filtering can be used in the current slice; a value of 0 indicates that reference-based adaptive filtering is disabled.
[0220] The `sh_tf_reuse_flag` syntax element indicates whether the current slice's encoding / decoding reuses historical filters. A value of 1 indicates reuse of historical filters, while a value of 0 indicates no reuse. When this flag is not present in the bitstream, its value should be equal to `ph_tf_reuse_flag`. In some cases, `sh_tf_new_flag` can also be used to represent a similar meaning; for example, `sh_tf_new_flag` of 1 means the current slice does not reuse historical filters, and `sh_tf_new_flag` of 0 means it reuses historical filters.
[0221] The `sh_tf_filter_mode` syntax element indicates which mode of the reference image-based adaptive filtering technique is used for the current slice. Values from 0 to 5 indicate which of the six filtering modes (unidirectional, bidirectional, etc.) is used. If this syntax element is not present in the bitstream, its value should be equal to `ph_tf_filter_mode`.
[0222] It should be understood that, unlike related technologies, the slice-level approach introduces indicative syntax elements related to TF_APS, used to indicate from which TF_APS the relevant parameter information for filtering is obtained, wherein:
[0223] `sh_num_tf_aps_ids_minus1` indicates the number of TF_APS from which the filtering parameter information used in the current slice is taken, minus 1. Since there are a maximum of 8 TF_APS in this invention, the value of this syntax element should be in the range of 0 to 7. When this syntax element does not exist in the bitstream, its value should be equal to `ph_num_tf_aps_ids_minus1`.
[0224] `sh_tf_aps_id` is the index of the TF_APS where the specific filter parameter exists, and its value ranges from 0 to 7. When this syntax element does not exist in the bitstream, its value should be equal to `ph_tf_aps_id`.
[0225] When pps_tf_info_in_ph_flag is 1, the following table 7 shows an example of the control flags for encoding and decoding TF in the image header.
[0226] Table 7
[0227] The meaning of each syntax element in Table 7 is similar to that of the corresponding slice-header syntax element; the only difference is scope. The slice-header syntax elements apply to the current slice, while the image-header syntax elements apply to the current image. Because of `pps_tf_info_in_ph_flag`, when the corresponding syntax elements are parsed in the image header, parsing of the slice-header syntax elements is skipped. In this case, the value of each slice-header syntax element should be equal to the value of the corresponding syntax element parsed from the image header. When the bitstream does not include the above image-level flags, their values should default to 0.
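The placement and inference rules described above can be summarized in a short Python sketch. The `TfHeaderInfo` container and `resolve_slice_tf_info` function are hypothetical; the sketch assumes that the header contents have already been parsed and only shows which header supplies the values that apply to a slice.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TfHeaderInfo:
    """Hypothetical container for TF control information of one header."""
    tf_enabled_flag: int = 0            # inferred to be 0 when absent
    num_tf_aps_ids_minus1: int = 0
    tf_aps_ids: List[int] = field(default_factory=list)


def resolve_slice_tf_info(pps_tf_info_in_ph_flag: Optional[int],
                          picture_header: Optional[TfHeaderInfo],
                          slice_header: Optional[TfHeaderInfo]) -> TfHeaderInfo:
    """Return the TF control information that applies to the current slice."""
    # The flag defaults to 0 when it is not present in the bitstream.
    flag = 0 if pps_tf_info_in_ph_flag is None else pps_tf_info_in_ph_flag
    if flag == 1:
        # Slice-level parsing is skipped; the slice inherits the values
        # parsed from the image header (defaults when those are absent).
        return picture_header if picture_header is not None else TfHeaderInfo()
    # Otherwise the control information is carried in the slice header.
    return slice_header if slice_header is not None else TfHeaderInfo()


ph = TfHeaderInfo(tf_enabled_flag=1, num_tf_aps_ids_minus1=1, tf_aps_ids=[0, 3])
sh = TfHeaderInfo(tf_enabled_flag=1, num_tf_aps_ids_minus1=0, tf_aps_ids=[2])
from_ph = resolve_slice_tf_info(1, ph, None)
from_sh = resolve_slice_tf_info(None, ph, sh)
```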
[0228] It is understandable that before obtaining the reference position-based filter-related parameters of the current block from the APS, these filter-related parameters need to be parsed from the bitstream and stored in the APS cache.
[0229] That is, in some embodiments, the method further includes: parsing the ninth and tenth syntax elements in the bitstream; wherein the ninth syntax element is used to indicate the index of the currently carried APS in the bitstream; the tenth syntax element is used to indicate the type of the currently carried APS in the bitstream; when the value of the tenth syntax element indicates that the type of the APS is a reference position-based filtering type, parsing the filter-related parameters of the currently carried reference position-based filter in the bitstream; and according to the value of the ninth syntax element, saving the corresponding parsed filter-related parameters of the reference position-based filter to the APS at the corresponding index.
[0230] In one possible implementation, the ninth syntax element could be `aps_adaptation_parameter_set_id`. For example, suppose there are a maximum of eight APS of type TF_APS (i.e., APS of reference position-based filtering type). That is, when the APS type is TF_APS, the value of `aps_adaptation_parameter_set_id` is between 0 and 7, used to identify the APS dataset with `aps_params_type` of TF_APS parsed from the bitstream.
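The parse-and-store step can be sketched as follows. The value 3 for `TF_APS` is only an example value, the limit of eight TF_APS follows the possible implementation described here, and `store_aps` with its dictionary cache is a hypothetical helper rather than part of any codec API.

```python
TF_APS = 3       # example value of aps_params_type for the TF type (assumption)
MAX_TF_APS = 8   # at most eight TF_APS in this possible implementation


def store_aps(aps_cache, aps_params_type, aps_adaptation_parameter_set_id,
              tf_data):
    """Save parsed TF filter parameters into the APS slot with the given id.

    Returns True when the APS was of type TF_APS and has been stored.
    """
    if aps_params_type != TF_APS:
        return False  # other APS types are handled elsewhere
    if not 0 <= aps_adaptation_parameter_set_id < MAX_TF_APS:
        raise ValueError("TF_APS index must be in the range 0 to 7")
    # A later APS with the same index replaces the earlier contents.
    aps_cache[aps_adaptation_parameter_set_id] = tf_data
    return True


cache = {}
stored = store_aps(cache, TF_APS, 3, {"tf_num_filters_signalled_minus1": 2})
skipped = store_aps(cache, 0, 1, {"other": True})
```

Because parameters are written only when an APS of this type arrives, the slice and image levels need to carry nothing more than the APS indices.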
[0231] Furthermore, in some embodiments, obtaining the corresponding APS based on the value of the parsed seventh syntax element includes: obtaining the corresponding APS from a pre-stored APS based on the value of the parsed seventh syntax element; wherein, the seventh syntax element is used to indicate the index of the APS used by the current slice or the current image; for example, the seventh syntax element may be sh_tf_aps_id or ph_tf_aps_id.
[0232] In some embodiments, parsing the filter-related parameters of the reference position-based filter currently carried in the bitstream includes:
[0233] Parsing one or more of the following syntax elements currently carried in the bitstream to determine the filter-related parameters of the currently carried reference position-based filter:
[0234] The eleventh syntax element; the eleventh syntax element is used to indicate the number of filters corresponding to the filter-related parameters in the current APS; the twelfth syntax element; the twelfth syntax element is used to indicate the absolute value of the filter coefficients in the current APS;
[0235] The thirteenth syntax element; the thirteenth syntax element is used to indicate the sign of the filter coefficients in the current APS;
[0236] The fourteenth syntax element; the fourteenth syntax element is used to indicate the precision of the filter coefficients in the current APS;
[0237] The fifteenth syntax element; the fifteenth syntax element is used to indicate the encoding method of the filter coefficients in the current APS;
[0238] The sixteenth syntax element; the sixteenth syntax element is used to indicate whether the filter in the current APS uses nonlinear truncation;
[0239] The seventeenth syntax element; the seventeenth syntax element is used to indicate the index of the nonlinear truncation used by the filter in the current APS.
[0240] It should be noted that, in the embodiments of this application, the current APS refers to the APS corresponding to the APS index indicated by the ninth syntax element, which is TF_APS.
[0241] For example, in one possible implementation, the eleventh syntax element could be `tf_num_filters_signalled_minus1`. For instance, assuming the current APS includes filter-related parameters for 3 filters, the value of the eleventh syntax element would be 2. That is, in some embodiments, the value of the eleventh syntax element is the number of filters corresponding to the filter-related parameters in the current APS minus 1.
[0242] For example, in one possible implementation, the twelfth syntax element could be tf_coeff_abs.
[0243] For example, in one possible implementation, the thirteenth syntax element could be tf_coeff_sign.
[0244] For example, in one possible implementation, the fourteenth syntax element can be tf_shift_minus6, and the value of the fourteenth syntax element is the precision of the filter coefficients in the current APS minus 6.
[0245] In some embodiments, the fifteenth syntax element is used to indicate the exponential Golomb k value used for encoding the filter coefficients of the filter in the current APS. Exemplarily, in one possible implementation, the fifteenth syntax element may be tf_k_order.
[0246] For example, in one possible implementation, the sixteenth syntax element could be tf_clip_flag.
[0247] For example, in one possible implementation, the seventeenth syntax element could be tf_clip_idx.
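As background for tf_k_order, the following Python sketch shows the textbook k-th order exponential Golomb code for unsigned values. It is an assumption that the coefficient magnitudes are binarized in exactly this way; the function names and the use of a string of '0'/'1' characters as the bit buffer are illustrative only.

```python
def encode_exp_golomb_k(value, k):
    """Encode a non-negative integer with a k-th order Exp-Golomb code."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    shifted = value + (1 << k)
    # Number of leading zeros that tells the decoder how many bits follow.
    prefix_zeros = shifted.bit_length() - k - 1
    return "0" * prefix_zeros + format(shifted, "b")


def decode_exp_golomb_k(bits, k, pos=0):
    """Decode one value starting at pos; return (value, next position)."""
    zeros = 0
    while bits[pos] == "0":
        zeros += 1
        pos += 1
    width = zeros + k + 1          # bits in the binary part, leading 1 included
    shifted = int(bits[pos:pos + width], 2)
    return shifted - (1 << k), pos + width


code = encode_exp_golomb_k(5, 1)
value, next_pos = decode_exp_golomb_k(code, 1)
```

A larger k spends more bits on small values and fewer on large ones, which is why the order is signalled per set of filters.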
[0248] In one possible implementation, a new type of APS can be added. After the corresponding type is resolved, it indicates that this APS is an APS storing the filter-related parameters of the filter based on the reference position. The corresponding modifications are shown in Table 8 below.
[0249] Table 8
[0250] In Table 8 above, "if(aps_params_type != TF_APS)" and "else if(aps_params_type == TF_APS) tf_data()" represent the improvements of the embodiments of this application. The first improvement is that `aps_chroma_present_flag` is parsed only when the current APS type is not `TF_APS`. This is because the current TF filtering technique primarily filters the luma component, so an APS of type `TF_APS` does not need to know whether chroma is present. The second improvement is that when the current APS type is `TF_APS` (that is, `aps_params_type == TF_APS`, where, for example, `TF_APS = 3`), the corresponding adaptive filtering parameters are further parsed.
[0251] When the current APS type is TF_APS, the specific adaptive parameter to be parsed is tf_data(). The content of tf_data() is basically the same as that in the slice header in Table 2, including the following parameters:
[0252] The number of filters in the current APS minus 1 (tf_num_filters_signalled_minus1). It should be understood that an APS can hold the filter-related parameters of one or more filters; this syntax element indicates the number of filters in the APS.
[0253] The precision of the filter coefficients of the current set of filters stored in the APS, minus 6 (tf_shift_minus6). This syntax element indicates the amplification factor of the filter coefficients. Here, the offset of 6 corresponds to multiplying the actual filter coefficients by 2 to the power of 6, that is, shifting them left by 6 bits. At the decoding end, the filter coefficients obtained by parsing the bitstream need to be divided by 2 to the power of 6 to obtain the actual filter coefficients, and filtering is performed based on this.
[0254] The filter coefficients of this set of filters currently stored in the APS are encoded using the exponential Golomb k-value tf_k_order;
[0255] Whether the set of filters currently stored in the APS uses nonlinear truncation (tf_clip_flag);
[0256] The absolute values of the filter coefficients tf_coeff_abs of this set of filters currently stored in APS;
[0257] The sign of the filter coefficients (tf_coeff_sign) of this set of filters currently stored in the APS;
[0258] The nonlinear truncation index (tf_clip_idx) used by the filters currently stored in the APS. This syntax element is used to limit the upper or lower bound of the input information (reconstructed values) of the filter.
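The coefficient scaling associated with tf_shift_minus6 can be illustrated with a short Python sketch. It assumes that the total shift equals tf_shift_minus6 + 6, so that a value of 0 corresponds to the division by 2 to the power of 6 described above; `actual_coefficients` is a hypothetical helper, and a real decoder would normally keep the integers and apply the shift after accumulation rather than convert to floating point.

```python
def actual_coefficients(parsed_coeffs, tf_shift_minus6):
    """Convert parsed integer coefficients back to their real-valued form."""
    if tf_shift_minus6 < 0:
        raise ValueError("tf_shift_minus6 must be non-negative")
    shift = tf_shift_minus6 + 6      # assumption: precision = value + 6
    scale = 1 << shift               # 2 ** shift
    return [c / scale for c in parsed_coeffs]


# With tf_shift_minus6 = 0 the parsed integers are divided by 64.
real_coeffs = actual_coefficients([64, -32, 16], 0)
```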
[0259] It should be noted that tf_data() can be understood as an example of filter-related parameters based on the reference position. tf_data() is shown in Table 9 below.
[0260] Table 9
[0261] The specific parameters of the filter coefficients in APS can be constructed using the above syntax elements.
[0262] In one possible implementation, there can be a maximum of 8 APS of type TF_APS. That is, when the APS type is TF_APS, the aps_adaptation_parameter_set_id is between 0 and 7, which is used to store the APS dataset with aps_params_type of TF_APS parsed from the bitstream. The data of TF_APS is stored in the TF_APS with the corresponding index of the value of aps_adaptation_parameter_set_id.
[0263] It is understandable that in related technologies, such as the filters shown in Figures 6-9, unidirectional and bidirectional filters each have two filtering shapes. Up to eight unidirectional filters and eight bidirectional filters can be cached as historical filters for use as candidates for filtering the current image. In these related technologies, when reusing historical filters, if the `tf_filter_mode` parsed from the current image / slice is in unidirectional mode, then only filters from the historical unidirectional filter cache can be selected and reused; if the `tf_filter_mode` parsed from the current image / slice is in bidirectional mode, then only filters from the historical bidirectional filter cache can be selected and reused.
[0264] In the embodiments of this application, the shape of the filter corresponding to the pre-stored APS (e.g., TF_APS) is not limited, nor is it limited whether the shapes of the filters corresponding to different APS are the same, nor is it limited whether the shapes of multiple filters in the same APS are the same.
[0265] In one possible implementation, the filters corresponding to the pre-stored APS (here referring to TF_APS) have the same shape, and the filter correlation parameters of the filter based on the reference position in each APS are used for unidirectional and bidirectional filtering.
[0266] It should be understood that in this application, the APS that includes filter-related parameters of the filter based on the reference position is referred to as TF_APS.
[0267] It is understood that in this embodiment, both unidirectional and bidirectional filters exist in the APS dataset of type TF_APS. Instead of each of the unidirectional and bidirectional filtering modes storing 8 sets of filters, a total of 8 sets of filters are cached across 8 APS for the current image candidate. This reduces the number of choices and combinations for the codec, affecting the compression performance of the codec. Therefore, in this embodiment, it is further proposed to integrate the shapes of unidirectional and bidirectional filters, enabling mutual reuse between them. This means that the filtering coefficients in the unidirectional filtering mode can be reused in the bidirectional filtering mode, and vice versa.
[0268] Step 1303: Filter the reconstructed value of the current position based on the reference position of the current position and the filter coefficients of the filter of the current block.
[0269] Furthermore, in some embodiments, step 1303 may include: determining the filtering mode of the current position; and filtering the reconstructed value of the current position according to the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block.
[0270] In some embodiments, determining the filtering mode for the current position includes: if it is determined that the current image or current slice uses a filtering technique based on a reference position, parsing the eighteenth syntax element in the bitstream to determine the filtering mode for the current position (the filtering mode indicates a filtering mode based on a reference position).
[0271] In one possible implementation, the bitstream can indicate whether the current image or slice uses a reference position-based filtering technique. For example, `ph_tf_enabled_flag` could indicate whether the current image uses a reference position-based filtering technique. Similarly, `sh_tf_enabled_flag` could indicate whether the current slice uses a reference position-based filtering technique.
[0272] In one possible implementation, the eighteenth syntax element can be `tf_filter_mode`. For example, if there are six filter modes based on the reference position, then the value of the eighteenth syntax element could be 0 to 5.
[0273] In some embodiments, the eighteenth syntax element is used to indicate the filtering mode of the current image or the current slice. In the embodiments of this application, there is no limitation on the type of filtering mode based on the reference position indicated by the eighteenth syntax element, which can be any one or more of the first to eighth modes described below.
[0274] In one possible implementation, the filtering mode of the current position is a first mode, which is a bidirectional filtering mode based on inter-frame prediction; the reference position of the current position includes a first reference position and a second reference position; wherein, the first reference position is the position determined by the first MV of the current position on the forward inter-frame reference image, and the second reference position is the position determined by the second MV of the current position on the backward inter-frame reference image.
[0275] Accordingly, in some embodiments, filtering the reconstructed value at the current position based on the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block includes: when the filtering mode of the current position is a first mode, obtaining first input information of the filter of the current block based on the first reference position, the forward inter-frame reference image, and the shape of the filter of the current block; obtaining second input information of the filter of the current block based on the second reference position, the backward inter-frame reference image, and the shape of the filter of the current block; wherein the filter corresponding to the first input information and the second input information is the same filter; determining third input information based on the first input information and the second input information; and filtering the reconstructed value at the current position based on the third input information and the filtering coefficients of the filter of the current block.
[0276] Furthermore, in some embodiments, obtaining the first input information of the filter of the current block based on the first reference position, the forward inter-frame reference image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the forward inter-frame reference image as the first reference region, with the first reference position as the center position of the filter of the current block, and using the reconstructed value of the first reference region as the first input information.
[0277] Furthermore, in some embodiments, obtaining the second input information of the filter of the current block based on the second reference position, the backward inter-frame reference image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the backward inter-frame reference image as the second reference region, with the second reference position as the center position of the filter of the current block, and using the reconstructed value of the second reference region as the second input information.
[0278] Furthermore, in some embodiments, determining the third input information based on the first input information and the second input information includes: performing a weighted average operation or an average operation on the reconstructed values at corresponding positions in the first input information and the second input information to obtain the third input information.
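The way the two inputs of the first mode are collected and merged can be sketched in Python. Several details here are assumptions made only for illustration: images are plain lists of rows, the filter shape is given as a list of (dy, dx) offsets from its center, positions outside the image are clamped to the border, and the weighted average uses integer rounding. `gather_region` and `combine_inputs` are hypothetical names.

```python
def gather_region(image, center, shape_offsets):
    """Read the reconstructed values covered by the filter shape.

    The filter is centred on the reference position `center` = (y, x).
    """
    height, width = len(image), len(image[0])
    cy, cx = center
    values = []
    for dy, dx in shape_offsets:
        y = min(max(cy + dy, 0), height - 1)   # assumption: clamp at borders
        x = min(max(cx + dx, 0), width - 1)
        values.append(image[y][x])
    return values


def combine_inputs(first, second, w0=1, w1=1):
    """Position-wise (weighted) average of two inputs, with rounding."""
    total = w0 + w1
    return [(w0 * a + w1 * b + total // 2) // total
            for a, b in zip(first, second)]


cross = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]   # 5-tap example shape
forward = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
backward = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]

first_input = gather_region(forward, (1, 1), cross)
second_input = gather_region(backward, (1, 1), cross)
third_input = combine_inputs(first_input, second_input)
```

Because both inputs are read with the same shape, a single set of filter coefficients can be applied to the merged input.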
[0279] In another possible implementation, the filtering mode for the current position is the second mode, which is a bidirectional filtering mode based on the co-location; the reference positions for the current position include a third reference position and a fourth reference position; wherein, the third reference position is the co-location of the current position on the nearest forward reference image of the current image, and the fourth reference position is the co-location of the current position on the nearest backward reference image of the current image.
[0280] Accordingly, in some other embodiments, filtering the reconstructed value at the current position based on the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block includes: when the filtering mode of the current position is the second mode, obtaining the fourth input information of the filter of the current block based on the third reference position, the nearest forward reference image, and the shape of the filter of the current block; obtaining the fifth input information of the filter of the current block based on the fourth reference position, the nearest backward reference image, and the shape of the filter of the current block; wherein the filter corresponding to the fourth input information and the fifth input information is the same filter; determining the sixth input information based on the fourth input information and the fifth input information; and filtering the reconstructed value at the current position based on the sixth input information and the filtering coefficients of the filter of the current block.
[0281] Furthermore, in some embodiments, obtaining the fourth input information of the filter of the current block based on the third reference position, the nearest forward reference image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the nearest forward reference image as the third reference region, with the third reference position as the center position of the filter of the current block, and using the reconstructed value of the third reference region as the fourth input information.
[0282] Furthermore, in some embodiments, obtaining the fifth input information of the filter of the current block based on the fourth reference position, the nearest backward reference image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the nearest backward reference image as the fourth reference region, with the fourth reference position as the center position of the filter of the current block, and using the reconstructed value of the fourth reference region as the fifth input information.
[0283] Furthermore, in some embodiments, determining the sixth input information based on the fourth and fifth input information includes: performing a weighted average operation or an average operation on the reconstructed values at corresponding positions in the fourth and fifth input information to obtain the sixth input information.
[0284] In some other embodiments, filtering the reconstructed value at the current position based on the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block includes: when the filtering mode of the current position is any one of the third to sixth modes, obtaining the seventh input information of the filter of the current block based on the reference position of the current position, the image where the reference position is located, and the shape of the filter of the current block; wherein the third to sixth modes are all unidirectional filtering modes; and filtering the reconstructed value at the current position based on the seventh input information and the filtering coefficients of the filter of the current block.
[0285] Furthermore, in some embodiments, obtaining the seventh input information of the filter of the current block based on the reference position of the current position, the image where the reference position is located, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the image where the reference position is located as the fifth reference region, with the reference position as the center position of the filter of the current block, and using the reconstructed value of the fifth reference region as the seventh input information.
[0286] In another possible implementation, the filtering mode of the current position is the seventh mode, which is a bidirectional filtering mode based on IBC prediction; the reference positions of the current position include a fifth reference position and a sixth reference position; wherein, the fifth reference position is the position determined on the current image based on the first BV of the current position, and the sixth reference position is the position determined on the current image based on the second BV of the current position.
[0287] Accordingly, the step of filtering the reconstructed value at the current position based on the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block includes: when the filtering mode of the current position is the seventh mode, obtaining the eighth input information of the filter of the current block based on the fifth reference position, the current image, and the shape of the filter of the current block; obtaining the ninth input information of the filter of the current block based on the sixth reference position, the current image, and the shape of the filter of the current block; wherein the filter corresponding to the eighth input information and the ninth input information is the same filter; determining the tenth input information based on the eighth input information and the ninth input information; and filtering the reconstructed value at the current position based on the tenth input information and the filtering coefficients of the filter of the current block.
[0288] Furthermore, in some embodiments, obtaining the eighth input information of the filter of the current block based on the fifth reference position, the current image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block in the current image as the sixth reference region with the fifth reference position as the center position of the filter of the current block, and using the reconstructed value of the sixth reference region as the eighth input information.
[0289] Furthermore, in some embodiments, obtaining the ninth input information of the filter of the current block based on the sixth reference position, the current image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block in the current image as the seventh reference region with the sixth reference position as the center position of the filter of the current block, and using the reconstructed value of the seventh reference region as the ninth input information.
[0290] Furthermore, in some embodiments, determining the tenth input information based on the eighth and ninth input information includes: performing a weighted average operation or an average operation on the reconstructed values at corresponding positions in the eighth and ninth input information to obtain the tenth input information.
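The position-wise combination of two pieces of input information described above can be sketched as follows. This is only an illustrative sketch: the function name `combine_inputs`, the integer weights, and the round-to-nearest rule are assumptions made for the example and are not specified in this application.

```python
from typing import List, Sequence


def combine_inputs(a: Sequence[int], b: Sequence[int],
                   w_a: int = 1, w_b: int = 1) -> List[int]:
    """Combine two equally sized sample lists position by position.

    Equal weights give a plain average; unequal weights give a weighted
    average. Rounding to the nearest integer is an assumption of this sketch.
    """
    if len(a) != len(b):
        raise ValueError("both inputs must cover the same filter shape")
    total = w_a + w_b
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Integer weighted average with a rounding offset of total // 2.
    return [(w_a * x + w_b * y + total // 2) // total for x, y in zip(a, b)]
```

Because both pieces of input information are gathered with the same filter shape, the two lists have the same length, and the combined list can be passed to the same filter.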
[0291] In another possible implementation, the filtering mode of the current position is the eighth mode, which is a bidirectional filtering mode based on IBC prediction and inter-frame prediction; the reference positions of the current position include the seventh reference position and the eighth reference position; wherein, the seventh reference position is the position determined on the current image based on the third BV of the current position, and the eighth reference position is the position determined on the corresponding reference image based on the third MV of the current position.
[0292] Accordingly, the step of filtering the reconstructed value at the current position based on the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block includes: when the filtering mode of the current position is the eighth mode, obtaining the eleventh input information of the filter of the current block based on the seventh reference position, the current image, and the shape of the filter of the current block; obtaining the twelfth input information of the filter of the current block based on the eighth reference position, the reference image corresponding to the third MV, and the shape of the filter of the current block; wherein the filters corresponding to the eleventh and twelfth input information are the same filter; determining the thirteenth input information based on the eleventh and twelfth input information; and filtering the reconstructed value at the current position based on the thirteenth input information and the filtering coefficients of the filter of the current block.
[0293] Furthermore, in some embodiments, obtaining the eleventh input information of the filter of the current block based on the seventh reference position, the current image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block in the current image as the eighth reference region with the seventh reference position as the center position of the filter of the current block, and using the reconstructed value of the eighth reference region as the eleventh input information.
[0294] Furthermore, in some embodiments, obtaining the twelfth input information of the filter of the current block based on the eighth reference position, the reference image corresponding to the third MV, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the reference image corresponding to the third MV as the ninth reference region with the eighth reference position as the center position of the filter of the current block, and using the reconstructed value of the ninth reference region as the twelfth input information.
[0295] Furthermore, in some embodiments, determining the thirteenth input information based on the eleventh and twelfth input information includes: performing a weighted average operation or an average operation on the reconstructed values at corresponding positions in the eleventh and twelfth input information to obtain the thirteenth input information.
[0296] In some embodiments, if the current block satisfies a first condition, filtering of the reconstructed value at the current position is skipped.
[0297] The first condition includes one or more of the following conditions:
[0298] (1) The inter-frame prediction mode is not selected as the prediction mode of the current block;
[0299] (2) The IBC prediction mode is not selected as the prediction mode of the current block;
[0300] (3) The values of the residual samples in the current block are all 0;
[0301] (4) The prediction mode of the current block is the inter-frame prediction mode, and the horizontal displacement component of the motion vector used is greater than or equal to the first threshold and / or the vertical displacement component of the motion vector used is greater than or equal to the second threshold;
[0302] (5) The prediction mode of the current block is the IBC prediction mode, and the horizontal displacement component of the block vector used is greater than or equal to the third threshold and / or the vertical displacement component of the block vector used is greater than or equal to the fourth threshold.
[0303] It is understood that in this embodiment, the filter-related parameters of the reference position-based filter are placed in the APS (adaptive parameter set) rather than in the image header or slice header. Furthermore, by unifying the shapes of the unidirectional and bidirectional filters, the same filter can be reused for unidirectional and bidirectional filtering. As a result, the encoder has more combinations to choose from, and the probability that the current block selects the reference position-based filtering technique increases. The decoder may therefore need to apply the reference position-based filtering technique to more coding tree units, which increases decoding time. To address this issue, in some embodiments, when one or more of the above conditions are met, filtering at the current position can be skipped even if the coding tree unit to which the current position belongs selects the TF (Temporal Filtering) mode, thereby reducing complexity.
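The skip decision can be sketched as follows. This is a minimal sketch under stated assumptions: the `BlockInfo` fields, the mode labels "inter" and "ibc", the use of absolute displacement values, the reading of "and / or" as a logical OR, and the rule that filtering is skipped as soon as any enabled condition holds are all choices made for illustration. This application does not fix how the enabled conditions are combined.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass
class BlockInfo:
    pred_mode: str                     # "inter", "ibc", or another label (assumed)
    residual_all_zero: bool            # True if every residual sample is 0
    vector: Tuple[int, int] = (0, 0)   # (horizontal, vertical) MV or BV


def condition_holds(num: int, blk: BlockInfo, thr: Dict[str, int]) -> bool:
    """Evaluate condition (1) to (5) of the first condition for one block."""
    dx, dy = abs(blk.vector[0]), abs(blk.vector[1])  # magnitudes (assumption)
    if num == 1:
        return blk.pred_mode != "inter"
    if num == 2:
        return blk.pred_mode != "ibc"
    if num == 3:
        return blk.residual_all_zero
    if num == 4:
        return blk.pred_mode == "inter" and (dx >= thr["first"] or dy >= thr["second"])
    if num == 5:
        return blk.pred_mode == "ibc" and (dx >= thr["third"] or dy >= thr["fourth"])
    raise ValueError(f"unknown condition number: {num}")


def skip_filtering(blk: BlockInfo, enabled: Iterable[int], thr: Dict[str, int]) -> bool:
    """Skip filtering if any enabled condition holds (combination rule assumed)."""
    return any(condition_holds(n, blk, thr) for n in enabled)
```

Passing the set of enabled conditions explicitly keeps the sketch neutral about which of the five conditions a given embodiment uses.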
[0304] This application provides an encoding method applied to an encoder.
[0305] Figure 15 is a schematic diagram of the implementation flow of the encoding method provided in the embodiment of this application; as shown in Figure 15, the method includes the following steps 1501 to 1503:
[0306] Step 1501: Determine the filter-related parameters of the current block, which are the filter-related parameters of a reference position-based filter. The reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-position of the current position on the reference image;
[0307] Step 1502: Determine the filter coefficients of the current block based on the filter-related parameters of the current block;
[0308] Step 1503: Filter the reconstructed value of the current position based on the reference position of the current position and the filter coefficients of the filter of the current block.
[0309] It is understood that in this embodiment, the filter-related parameters of the reference position-based filter, originally carried at the image or slice level, are moved to the APS (adaptive parameter set). Previously, whether these filter-related parameters were encoded and decoded at the image level or at the slice level, corresponding caches were needed to store the parsed coefficients. However, the inventors of this application found that these filter-related parameters are not updated and parsed very frequently and do not need to be updated at the image or slice level. Therefore, placing these filter-related parameters in the APS reduces the cache size required at the slice and image levels, saving cache resources.
[0310] In some embodiments, step 1503, filtering the reconstructed value of the current position based on the reference position of the current position and the filtering coefficients of the filter of the current block, may further include: determining the filtering mode of the current position; and filtering the reconstructed value of the current position based on the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block.
[0311] In some embodiments, the encoding method further includes: determining the filtering mode of the current image or current slice when it is determined that the current image or current slice uses a reference position-based filtering technique; determining the value of the eighteenth syntax element according to the filtering mode of the current image or current slice; encoding the value of the eighteenth syntax element and writing the obtained encoded bits into the bitstream.
[0312] In one possible implementation, the eighteenth syntax element can be `tf_filter_mode`. For example, if there are six filter modes based on the reference position, then the value of the eighteenth syntax element could be 0 to 5.
[0313] In some embodiments, the eighteenth syntax element is used to indicate the filtering mode of the current image or the current slice. In the embodiments of this application, there is no limitation on the type of filtering mode based on the reference position indicated by the eighteenth syntax element, which can be any one or more of the first to eighth modes described below.
[0314] In one possible implementation, the filtering mode of the current position is a first mode, which is a bidirectional filtering mode based on inter-frame prediction; the reference position of the current position includes a first reference position and a second reference position; wherein, the first reference position is the position determined by the first MV of the current position on the forward inter-frame reference image, and the second reference position is the position determined by the second MV of the current position on the backward inter-frame reference image.
[0315] Accordingly, in some embodiments, filtering the reconstructed value at the current position based on the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block includes: when the filtering mode of the current position is a first mode, obtaining first input information of the filter of the current block based on the first reference position, the forward inter-frame reference image, and the shape of the filter of the current block; obtaining second input information of the filter of the current block based on the second reference position, the backward inter-frame reference image, and the shape of the filter of the current block; wherein the filter corresponding to the first input information and the second input information is the same filter; determining third input information based on the first input information and the second input information; and filtering the reconstructed value at the current position based on the third input information and the filtering coefficients of the filter of the current block.
[0316] Furthermore, in some embodiments, obtaining the first input information of the filter of the current block based on the first reference position, the forward inter-frame reference image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the forward inter-frame reference image as the first reference region, with the first reference position as the center position of the filter of the current block, and using the reconstructed value of the first reference region as the first input information.
[0317] Furthermore, in some embodiments, obtaining the second input information of the filter of the current block based on the second reference position, the backward inter-frame reference image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the backward inter-frame reference image as the second reference region, with the second reference position as the center position of the filter of the current block, and using the reconstructed value of the second reference region as the second input information.
[0318] Furthermore, in some embodiments, determining the third input information based on the first input information and the second input information includes: performing a weighted average operation or an average operation on the reconstructed values at corresponding positions in the first input information and the second input information to obtain the third input information.
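Gathering the reference region that serves as one piece of input information can be sketched as follows. The sketch assumes that the filter shape is given as a list of (dx, dy) tap offsets relative to the filter center and that taps falling outside the image are clamped to the nearest border sample. Both the representation and the border rule, as well as the name `reference_region`, are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple


def reference_region(image: Sequence[Sequence[int]],
                     center: Tuple[int, int],
                     shape: Sequence[Tuple[int, int]]) -> List[int]:
    """Return the reconstructed samples covered by the filter centered at `center`.

    `image` is indexed as image[y][x]; `center` is (x, y); `shape` lists the
    (dx, dy) offsets of the filter taps. Out-of-image taps are clamped to the
    nearest border sample, which is an assumption of this sketch.
    """
    height, width = len(image), len(image[0])
    cx, cy = center
    samples = []
    for dx, dy in shape:
        x = min(max(cx + dx, 0), width - 1)
        y = min(max(cy + dy, 0), height - 1)
        samples.append(image[y][x])
    return samples
```

Calling the function once with the first reference position on the forward inter-frame reference image and once with the second reference position on the backward inter-frame reference image, with the same `shape`, yields two lists of equal length that can then be averaged position by position.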
[0319] In another possible implementation, the filtering mode for the current position is the second mode, which is a bidirectional filtering mode based on the co-location; the reference positions for the current position include a third reference position and a fourth reference position; wherein, the third reference position is the co-location of the current position on the nearest forward reference image of the current image, and the fourth reference position is the co-location of the current position on the nearest backward reference image of the current image.
[0320] Accordingly, in some other embodiments, filtering the reconstructed value at the current position based on the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block includes: when the filtering mode of the current position is the second mode, obtaining the fourth input information of the filter of the current block based on the third reference position, the nearest forward reference image, and the shape of the filter of the current block; obtaining the fifth input information of the filter of the current block based on the fourth reference position, the nearest backward reference image, and the shape of the filter of the current block; wherein the filter corresponding to the fourth input information and the fifth input information is the same filter; determining the sixth input information based on the fourth input information and the fifth input information; and filtering the reconstructed value at the current position based on the sixth input information and the filtering coefficients of the filter of the current block.
[0321] Furthermore, in some embodiments, obtaining the fourth input information of the filter of the current block based on the third reference position, the nearest forward reference image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the nearest forward reference image as the third reference region, with the third reference position as the center position of the filter of the current block, and using the reconstructed value of the third reference region as the fourth input information.
[0322] Furthermore, in some embodiments, obtaining the fifth input information of the filter of the current block based on the fourth reference position, the nearest backward reference image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the nearest backward reference image as the fourth reference region, with the fourth reference position as the center position of the filter of the current block, and using the reconstructed value of the fourth reference region as the fifth input information.
[0323] Furthermore, in some embodiments, determining the sixth input information based on the fourth and fifth input information includes: performing a weighted average operation or an average operation on the reconstructed values at corresponding positions in the fourth and fifth input information to obtain the sixth input information.
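Selecting the nearest forward and nearest backward reference images for the second mode can be sketched as follows. The sketch assumes that images are identified by a picture order count (POC), that "forward" means a POC smaller than that of the current image, and that "backward" means a larger POC. None of these conventions, nor the name `nearest_references`, is stated in this application.

```python
from typing import Iterable, Optional, Tuple


def nearest_references(current_poc: int,
                       ref_pocs: Iterable[int]) -> Tuple[Optional[int], Optional[int]]:
    """Return (nearest forward POC, nearest backward POC), or None where absent."""
    refs = list(ref_pocs)
    forward = [p for p in refs if p < current_poc]    # earlier images (assumed)
    backward = [p for p in refs if p > current_poc]   # later images (assumed)
    return (max(forward) if forward else None,
            min(backward) if backward else None)
```

The co-position then has the same coordinates as the current position on each of the two selected images.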
[0324] In some other embodiments, filtering the reconstructed value at the current position based on the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block includes: when the filtering mode of the current position is any one of the third to sixth modes, obtaining the seventh input information of the filter of the current block based on the reference position of the current position, the image where the reference position is located, and the shape of the filter of the current block; wherein the third to sixth modes are all unidirectional filtering modes; and filtering the reconstructed value at the current position based on the seventh input information and the filtering coefficients of the filter of the current block.
[0325] Furthermore, in some embodiments, obtaining the seventh input information of the filter of the current block based on the reference position of the current position, the image where the reference position is located, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the image where the reference position is located as the fifth reference region, with the reference position as the center position of the filter of the current block, and using the reconstructed value of the fifth reference region as the seventh input information.
[0326] In another possible implementation, the filtering mode of the current position is the seventh mode, which is a bidirectional filtering mode based on IBC prediction; the reference positions of the current position include a fifth reference position and a sixth reference position; wherein, the fifth reference position is the position determined on the current image based on the first BV of the current position, and the sixth reference position is the position determined on the current image based on the second BV of the current position.
[0327] Accordingly, the step of filtering the reconstructed value at the current position based on the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block includes: when the filtering mode of the current position is the seventh mode, obtaining the eighth input information of the filter of the current block based on the fifth reference position, the current image, and the shape of the filter of the current block; obtaining the ninth input information of the filter of the current block based on the sixth reference position, the current image, and the shape of the filter of the current block; wherein the filter corresponding to the eighth input information and the ninth input information is the same filter; determining the tenth input information based on the eighth input information and the ninth input information; and filtering the reconstructed value at the current position based on the tenth input information and the filtering coefficients of the filter of the current block.
[0328] Furthermore, in some embodiments, obtaining the eighth input information of the filter of the current block based on the fifth reference position, the current image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block in the current image as the sixth reference region with the fifth reference position as the center position of the filter of the current block, and using the reconstructed value of the sixth reference region as the eighth input information.
[0329] Furthermore, in some embodiments, obtaining the ninth input information of the filter of the current block based on the sixth reference position, the current image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block in the current image as the seventh reference region with the sixth reference position as the center position of the filter of the current block, and using the reconstructed value of the seventh reference region as the ninth input information.
[0330] Furthermore, in some embodiments, determining the tenth input information based on the eighth and ninth input information includes: performing a weighted average operation or an average operation on the reconstructed values at corresponding positions in the eighth and ninth input information to obtain the tenth input information.
[0331] In another possible implementation, the filtering mode of the current position is the eighth mode, which is a bidirectional filtering mode based on IBC prediction and inter-frame prediction; the reference positions of the current position include the seventh reference position and the eighth reference position; wherein, the seventh reference position is the position determined on the current image based on the third BV of the current position, and the eighth reference position is the position determined on the corresponding reference image based on the third MV of the current position.
[0332] Accordingly, the step of filtering the reconstructed value at the current position based on the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block includes: when the filtering mode of the current position is the eighth mode, obtaining the eleventh input information of the filter of the current block based on the seventh reference position, the current image, and the shape of the filter of the current block; obtaining the twelfth input information of the filter of the current block based on the eighth reference position, the reference image corresponding to the third MV, and the shape of the filter of the current block; wherein the filters corresponding to the eleventh and twelfth input information are the same filter; determining the thirteenth input information based on the eleventh and twelfth input information; and filtering the reconstructed value at the current position based on the thirteenth input information and the filtering coefficients of the filter of the current block.
[0333] Furthermore, in some embodiments, obtaining the eleventh input information of the filter of the current block based on the seventh reference position, the current image, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block in the current image as the eighth reference region with the seventh reference position as the center position of the filter of the current block, and using the reconstructed value of the eighth reference region as the eleventh input information.
[0334] Furthermore, in some embodiments, obtaining the twelfth input information of the filter of the current block based on the eighth reference position, the reference image corresponding to the third MV, and the shape of the filter of the current block includes: determining the area covered by the filter of the current block on the reference image corresponding to the third MV as the ninth reference region with the eighth reference position as the center position of the filter of the current block, and using the reconstructed value of the ninth reference region as the twelfth input information.
[0335] Furthermore, in some embodiments, determining the thirteenth input information based on the eleventh and twelfth input information includes: performing a weighted average operation or an average operation on the reconstructed values at corresponding positions in the eleventh and twelfth input information to obtain the thirteenth input information.
[0336] In some embodiments, filtering of the reconstructed value at the current position is skipped if the current block satisfies a first condition; wherein the first condition includes one or more of the following conditions:
[0337] (1) The inter-frame prediction mode is not selected for the current block;
[0338] (2) The IBC prediction mode is not selected for the current block;
[0339] (3) The residual samples of the current block all have a value of 0;
[0340] (4) The prediction mode of the current block is the inter-frame prediction mode, and the horizontal displacement component of the motion vector used is greater than or equal to the first threshold and / or the vertical displacement component of the motion vector used is greater than or equal to the second threshold;
[0341] (5) The prediction mode of the current block is the IBC prediction mode, and the horizontal displacement component of the block vector used is greater than or equal to the third threshold and / or the vertical displacement component of the block vector used is greater than or equal to the fourth threshold.
[0342] In some embodiments, the encoding method further includes: determining the values of a first syntax element and a second syntax element; wherein the first syntax element is used to indicate the index of the APS used for filtering in the current block in the first APS candidate list; the second syntax element is used to indicate the index of the filtering-related parameters of the current block in the APS indicated by the first syntax element; encoding the values of the first syntax element and the second syntax element, and writing the obtained encoded bits into the bitstream.
[0343] Furthermore, in some embodiments, determining the values of the first syntax element and the second syntax element includes: determining the number of APSs used by the current slice or the current image (where APS refers to TF_APS); if the number of APSs used by the current slice or the current image is greater than 1, determining the values of the first syntax element and the second syntax element; wherein the APS includes filtering-related parameters of one or more filters.
[0344] In some embodiments, the encoding method further includes: determining the value of a third syntax element based on the number of APSs used in the current slice or the current image; wherein the third syntax element is used to indicate the number of APSs used in the current slice or the current image; encoding the value of the third syntax element and writing the obtained encoded bits into the bitstream.
[0345] Furthermore, in some embodiments, determining the values of the first syntax element and the second syntax element when the number of APS used in the current slice or the current image is greater than 1 includes: determining the values of the first syntax element and the second syntax element when it is determined that the current block uses a reference position-based filtering technique and the number of APS used in the current slice or the current image is greater than 1.
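The condition under which the first and second syntax elements are determined can be sketched as follows. The function name, the element labels, and the range check are hypothetical. The sketch returns the values that would be handed to the entropy coder and does not model the entropy coding itself.

```python
from typing import List, Tuple


def aps_selection_elements(block_uses_filter: bool, num_aps_in_use: int,
                           aps_list_index: int,
                           params_index: int) -> List[Tuple[str, int]]:
    """Return the (name, value) pairs to encode for the current block.

    The two elements are produced only when the block uses the reference
    position-based filtering technique and more than one APS is in use.
    """
    if not (block_uses_filter and num_aps_in_use > 1):
        return []
    if not 0 <= aps_list_index < num_aps_in_use:
        raise ValueError("APS index must point into the first APS candidate list")
    return [("first_syntax_element", aps_list_index),
            ("second_syntax_element", params_index)]
```

When the function returns an empty list, nothing is written for these two elements in this sketch. How a decoder infers the missing values in that case is outside the scope of the example.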
[0346] In some embodiments, the encoding method further includes: determining the value of the fourth syntax element according to whether the current block uses a reference position-based filtering technique; encoding the value of the fourth syntax element; and writing the obtained encoded bits into the bitstream.
[0347] In some embodiments, the encoding method further includes: determining whether the current block uses a reference position-based filtering technique if it is determined that the current slice or the current image is filtered using a reference position-based filter.
[0348] In some embodiments, the encoding method further includes: determining the value of the fifth syntax element according to whether the current slice or current image is filtered using a reference position-based filter; encoding the value of the fifth syntax element and writing the obtained encoded bits into the bitstream.
[0349] In some embodiments, the encoding method further includes: determining the index of at least one APS used in the current slice or the current image; and constructing a first APS candidate list based on the index of at least one APS used in the current slice or the current image and the corresponding APS.
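Constructing the first APS candidate list can be sketched as follows. The sketch assumes that previously signaled APSs are kept in a dictionary keyed by APS index and that the candidate list preserves the order in which the indices are listed for the current slice or image. The storage layout and the name `build_aps_candidate_list` are hypothetical.

```python
from typing import Dict, List, Sequence


def build_aps_candidate_list(aps_store: Dict[int, dict],
                             used_indices: Sequence[int]) -> List[dict]:
    """Collect the APSs used by the current slice or image into a list."""
    missing = [i for i in used_indices if i not in aps_store]
    if missing:
        raise KeyError(f"APS index not available: {missing}")
    # Position k in the result corresponds to the k-th listed APS index.
    return [aps_store[i] for i in used_indices]
```

A position in the returned list is what a block-level index into the first APS candidate list would refer to.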
[0350] In some embodiments, the encoding method further includes: determining the value of a seventh syntax element based on the index of at least one APS used by the current slice or the current image; wherein the seventh syntax element is used to indicate the index of the APS used by the current slice or the current image; encoding the value of the seventh syntax element and writing the obtained encoded bits into the bitstream.
[0351] In some embodiments, the control identifier of the reference position-based filtering technique includes the third syntax element and the seventh syntax element; the encoding method further includes: when the control identifier exists in the image header of the bitstream, setting the value of the eighth syntax element to a first value, encoding the value of the eighth syntax element, and writing the obtained encoded bits into the bitstream; or, when the control identifier exists in the slice header of the bitstream, setting the value of the eighth syntax element to a second value, encoding the value of the eighth syntax element, and writing the obtained encoded bits into the bitstream; or, when the control identifier exists in the slice header of the bitstream, not indicating the location of the control identifier in the bitstream.
[0352] In some embodiments, the encoding method further includes: determining the values of a ninth syntax element and a tenth syntax element; wherein the ninth syntax element is used to indicate the index of the APS currently carried in the bitstream; the tenth syntax element is used to indicate the type of the APS currently carried in the bitstream; encoding the values of the ninth syntax element and the tenth syntax element, and writing the obtained encoded bits into the bitstream.
[0353] In some embodiments, the encoding method further includes: determining one or more of the following syntax elements based on the filter correlation parameters of the filter based on the reference position in the APS corresponding to the value of the ninth syntax element:
[0354] The eleventh syntax element; the eleventh syntax element is used to indicate the number of filters corresponding to the filter-related parameters in the current APS;
[0355] The twelfth syntax element; the twelfth syntax element is used to indicate the absolute value of the filter coefficients in the current APS;
[0356] The thirteenth syntax element; the thirteenth syntax element is used to indicate the sign of the filter coefficients in the current APS;
[0357] The fourteenth syntax element; the fourteenth syntax element is used to indicate the precision of the filter coefficients in the current APS;
[0358] The fifteenth syntax element; the fifteenth syntax element is used to indicate the encoding method of the filter coefficients in the current APS;
[0359] The sixteenth syntax element; the sixteenth syntax element is used to indicate whether the filter in the current APS uses nonlinear truncation;
[0360] The seventeenth syntax element; the seventeenth syntax element is used to indicate the index of the nonlinear truncation used by the filter in the current APS;
[0361] The values of one or more syntax elements from the eleventh to the seventeenth syntax elements are encoded, and the resulting encoded bits are written into the bitstream.
[0362] In some embodiments, the filters corresponding to the reference position in the APS indicated in the bitstream have the same shape, and the filter-related parameters of the reference position-based filters in each APS are used for unidirectional filtering and bidirectional filtering.
[0363] It should be noted that, at the encoding end, the values of one or more of the above syntax elements can be determined using a rate-distortion optimization algorithm.
[0364] The descriptions of the above encoding method embodiments are similar to those of the above decoding method embodiments, and have similar beneficial effects. For technical details not disclosed in the encoding method embodiments of this application, please refer to the descriptions of the decoding method embodiments of this application for understanding.
[0365] The following examples illustrate possible implementation schemes of the encoding and decoding methods described in one or more of the above embodiments.
[0366] The following describes the improvements at the APS level.
[0367] It is understood that in VVC and ECM, the APS, as an adaptive dataset, can be used to transmit different types of adaptive parameters. In this embodiment, the filter-related parameters of the filter based on the reference position are transmitted as part of the APS; this type of APS is referred to as TF_APS. The filter-related parameters of the filter based on the reference position may include, but are not limited to, one or more of the following: the number of filters, the coefficients of the filters, the adaptive accuracy of the filters, and the nonlinear cutoff value of the filters. In addition, to accommodate the newly added filter-related parameters of the filter based on the reference position in the APS, a new type of APS is also allocated; this new APS type is referred to as TF_TYPE.
[0368] As can be understood, an APS is a collection used to store adaptive parameters. It is a data type independent of the image or slice. The data stored in this APS includes several types: TF parameters, ALF parameters, CCALF parameters, quantization parameters, etc.
[0369] In this embodiment, a new APS type, TF_TYPE, is introduced for the APS in VVC or ECM. When the parsed APS type is TF_TYPE, the corresponding APS should include filter-related parameters based on the reference position. The APS type is parsed first in the bitstream; after the APS type is determined, the corresponding parameters are parsed based on that type.
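The type-first parsing order described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the syntax of Table 10: the function `parse_aps`, the string labels used for the APS types, and the modelling of the bitstream as an iterator of already entropy-decoded values are all hypothetical, and the TF payload is treated as an opaque object. The sketch also assumes that the chroma-present flag is not parsed for a TF-type APS, since TF filtering is described here for the luma component.

```python
from typing import Any, Dict, Iterator

# Hypothetical labels; the numeric codes of the APS types are not given in the text.
ALF_APS, LMCS_APS, SCALING_APS, TF_APS = "ALF_APS", "LMCS_APS", "SCALING_APS", "TF_APS"


def parse_aps(symbols: Iterator[Any]) -> Dict[str, Any]:
    """Parse one APS from a stream of already-decoded syntax element values."""
    aps: Dict[str, Any] = {}
    # The APS type is read first; it decides what is parsed afterwards.
    aps["aps_params_type"] = next(symbols)
    aps["aps_adaptation_parameter_set_id"] = next(symbols)
    if aps["aps_params_type"] == TF_APS:
        # Assumption: no chroma-present flag is read for a TF-type APS.
        aps["tf_data"] = next(symbols)
    else:
        aps["aps_chroma_present_flag"] = next(symbols)
        aps["payload"] = next(symbols)
    return aps


# Illustrative inputs only: a TF-type APS and an ALF-type APS.
tf_aps = parse_aps(iter([TF_APS, 0, {"num_filters": 2}]))
alf_aps = parse_aps(iter([ALF_APS, 3, 1, {"alf_data": "opaque"}]))
```

In this sketch the two APSs consume a different number of values from the stream, which is the practical consequence of dispatching on the type before reading anything else.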
[0370] Table 10 below is an example of parsing the syntax elements related to APS (adaptation_parameter_set_rbsp()) (this is the information contained in APS in VVC):
[0371] Table 10
[0372] In Table 10, the syntax elements in an APS include the dataset index of that APS:
[0373] The index is represented by the aps_adaptation_parameter_set_id syntax element. This index is used to represent the current APS. When a tool needs to select this APS, it can use this index to find the APS and then obtain the adaptive parameters in the APS.
[0374] The syntax elements in the APS further include the type of this APS:
[0375] The type is represented by the aps_params_type syntax element. APS types include ALF_APS, LMCS_APS, and SCALING_APS, which are used to indicate which tool the adaptive parameters in this APS belong to.
[0376] The syntax elements in the APS further include whether the APS contains a chroma component:
[0377] Whether a chroma component is included is indicated by the aps_chroma_present_flag syntax element.
[0378] In one possible implementation, a new type of APS can be added. When the corresponding type is parsed, it indicates that this APS stores the filter-related parameters of the filter based on the reference position. See Table 8 for the corresponding modifications.
[0379] It should be noted that tf_data() can be understood as an example of filter-related parameters based on the reference position. See Table 9 for tf_data().
[0380] The following describes the improvements to the image header and slice header.
[0381] It is understandable that improvements to the syntax elements of the APS require corresponding improvements to the syntax elements of the image header and slice header. This is mainly because the APS is a dataset independent of the image header and slice header. Therefore, if the current image or slice selects a reference position-based filtering technique (i.e., TF technique or TF filtering technique), it is necessary to parse the bitstream to determine which TF_APS contains the TF filtering-related parameters. In one possible implementation, a bitstream can carry eight TF_APS.
[0382] Unlike related technologies, since the TF filtering parameters are parsed and stored in the APS, the TF-related parameters that the current image or slice selects for filtering must be further determined from the APS indices parsed from the image header and slice header. If the parsed APS index is 0, the relevant filtering parameters are obtained from the 0th TF_APS-type APS; if the parsed APS indices are 0 and 1, the filtering parameters obtained from the 0th and 1st TF_APS-type APSs are used for filtering the current image or slice.
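The lookup from header-level APS indices to stored TF parameters can be sketched as below. This is only an illustration: the function `build_tf_candidate_list`, the dictionary used as the TF_APS store, and the limit constant `MAX_TF_APS` (taken from the "eight TF_APS" example above) are assumptions made for the example and are not names from the bitstream syntax.

```python
from typing import Dict, List

MAX_TF_APS = 8  # assumed limit, following the "eight TF_APS" example in the text


def build_tf_candidate_list(tf_aps_by_id: Dict[int, dict],
                            header_aps_ids: List[int]) -> List[dict]:
    """Return the TF_APS payloads named by the header, in signalled order."""
    if len(header_aps_ids) > MAX_TF_APS:
        raise ValueError("more TF_APS indices than the assumed maximum")
    candidates = []
    for aps_id in header_aps_ids:
        if aps_id not in tf_aps_by_id:
            raise KeyError(f"TF_APS {aps_id} has not been received")
        candidates.append(tf_aps_by_id[aps_id])
    return candidates


# Illustrative store holding three previously parsed TF-type APSs.
stored = {0: {"name": "tf_aps_0"}, 1: {"name": "tf_aps_1"}, 5: {"name": "tf_aps_5"}}
only_first = build_tf_candidate_list(stored, [0])     # header signalled index 0
first_two = build_tf_candidate_list(stored, [0, 1])   # header signalled indices 0 and 1
```

Because the store is keyed by APS index, an APS that is not referenced by the current header stays untouched and can still be referenced by a later image or slice.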
[0383] In one possible implementation, the TF identifier can be carried in either the image header or the slice header. The existing `pps_alf_info_in_ph_flag` can be reused to control whether the TF identifier and the APS dataset index are in the image header or the slice header, or an independent identifier can be used, for example `pps_tf_info_in_ph_flag`. If a separate identifier is used to control whether the TF identifier exists in the image header or the slice header, see the example shown in Table 5 for a possible implementation.
[0384] When pps_tf_info_in_ph_flag is 0, examples of information such as the control flag for encoding and decoding TF in the slice header are shown in Table 6.
[0385] When pps_tf_info_in_ph_flag is 1, examples of information such as the control flag for encoding and decoding TF in the image header are shown in Table 7.
[0386] The meaning of each syntax element in Table 7 is similar to that of the corresponding slice header syntax element; the only difference is their scope. The scope of a slice header syntax element is the current slice, while the scope of an image header syntax element is the current image. Because of `pps_tf_info_in_ph_flag`, when the corresponding syntax elements are parsed in the image header, the parsing of the slice header syntax elements is skipped. In this case, the value of each slice header syntax element should be equal to the value of the corresponding syntax element parsed from the image header. When the bitstream does not include the above image-level flags, their values should default to 0.
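The inheritance and default rules above can be illustrated with a short sketch. Since Tables 5 to 7 are not reproduced here, the dictionary keys `tf_enabled_flag` and `num_tf_aps_ids_minus1` and the function `effective_tf_info` are hypothetical stand-ins for the actual header syntax elements; only `pps_tf_info_in_ph_flag` is taken from the text.

```python
from typing import Dict


def effective_tf_info(pps_tf_info_in_ph_flag: int,
                      ph_info: Dict[str, int],
                      sh_info: Dict[str, int]) -> Dict[str, int]:
    """Return the TF header values that apply to the current slice."""
    # Flag equal to 1: the elements are parsed in the image header, slice-level
    # parsing is skipped, and the slice takes the image header values.
    # Flag equal to 0: the elements are parsed in the slice header.
    source = ph_info if pps_tf_info_in_ph_flag == 1 else sh_info
    # Elements that are absent from the bitstream default to 0.
    return {
        "tf_enabled_flag": source.get("tf_enabled_flag", 0),
        "num_tf_aps_ids_minus1": source.get("num_tf_aps_ids_minus1", 0),
    }


from_picture = effective_tf_info(1, {"tf_enabled_flag": 1, "num_tf_aps_ids_minus1": 1}, {})
from_slice = effective_tf_info(0, {}, {"tf_enabled_flag": 1})
absent = effective_tf_info(1, {}, {})
```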
[0387] The following describes improvements at the coding tree unit level.
[0388] It is understandable that, since the filtering-related parameters of a filter are moved to the APS, the value range of the filter index parsed for each coding tree unit should depend on which APS the filter is in (the number of filters differs between APSs). In one possible implementation, Table 11 shows an example of coding tree unit-level parsing that determines whether the luma component of the current coding tree unit uses the reference position-based filtering technique (i.e., TF filtering) and, if so, which filter is used.
[0389] Table 11
[0390] Similar to the related technologies mentioned earlier, the `tf_ctb_enabled` flag (which is also an example of the fourth syntax element) is used to determine whether the current coding tree unit uses TF filtering (i.e., reference position-based filtering). A value of 1 indicates that it is used, otherwise it is not. If this syntax element does not exist in the bitstream, the flag is 0.
[0391] Unlike the related technologies mentioned earlier,
[0392] The `tf_setId` syntax element (an example of the first syntax element) is used to determine which TF_APS the selected filter group should be retrieved from when the current coding tree unit uses TF filtering. The index of the selected TF_APS should be equal to `sh_tf_aps_id[tf_setId]`, and the selected filter group is found through the TF_APS index. The value of `tf_setId` should range from 0 to `sh_num_tf_aps_ids_minus1`. Its value defaults to 0 when the bitstream does not contain this syntax element.
[0393] `tf_ctb_filter_idx` identifies which filter from a selected set of TF filters (i.e., reference position-based filters) will be used for filtering the current coding tree luma block. Specifically, it needs to further determine the selected filter coefficients, filter adaptive accuracy, filter shape, filter nonlinear cutoff value, etc., from the selected TF_APS. If there are N filters in the selected TF_APS, the value of this syntax element should be in the range of 0 to N-1.
[0394] Specifically, the TF_APS-type APS dataset with index sh_tf_aps_id[tf_setId] should first be found, and then the following values in this dataset should be found based on the value of tf_ctb_filter_idx:
[0395] 1. The filter coefficient absolute value array tf_coeff_abs[tf_ctb_filter_idx];
[0396] 2. The filter coefficient sign array tf_coeff_sign[tf_ctb_filter_idx];
[0397] 3. The filter cutoff value index array tf_clip_idx[tf_ctb_filter_idx].
[0398] This information is used to construct the filter coefficients and cutoff values, which are then applied to the filtering of the current luma block in the coding tree.
[0399] For example, each selected filter coefficient tfCoeff[i] is:
[0400] tfCoeff[i] = tf_coeff_sign[tf_ctb_filter_idx][i] == 1 ?
[0401] -tf_coeff_abs[tf_ctb_filter_idx][i] : tf_coeff_abs[tf_ctb_filter_idx][i]
[0402] Where i is the index of each filtering position of the filter. For example, in a filter with 13 coefficients, the range of i is 0 to 12.
[0403] Each selected filter cutoff value tfClip[i] is:
[0404] tfClip[i]=1<<(7-tf_clip_idx[tf_ctb_filter_idx][i]*2+(inputBitdepth-8))
[0405] Where i is the index of each filtering position of the filter. For example, in a filter with 13 coefficients, the range of i is 0 to 12.
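The lookup and the two formulas above can be put together in a short sketch. The array names follow the syntax elements in the text; the function `select_filter`, the dictionary layout of a stored TF_APS, and the sample values are assumptions made for the example. A sign value of 1 is taken to mean a negative coefficient, and three coefficients are used instead of 13 purely to keep the example short.

```python
from typing import Dict, List, Tuple


def select_filter(tf_aps_by_id: Dict[int, dict],
                  sh_tf_aps_id: List[int],
                  tf_set_id: int,
                  tf_ctb_filter_idx: int,
                  input_bitdepth: int) -> Tuple[List[int], List[int]]:
    """Return (tfCoeff, tfClip) for the filter chosen by the coding tree unit."""
    # Step 1: tf_setId selects the TF_APS through the slice-level index list.
    aps = tf_aps_by_id[sh_tf_aps_id[tf_set_id]]
    # Step 2: the filter index must lie in 0..N-1 for the N filters of that APS.
    num_filters = len(aps["tf_coeff_abs"])
    if not 0 <= tf_ctb_filter_idx < num_filters:
        raise ValueError("tf_ctb_filter_idx out of range for the selected TF_APS")
    coeff_abs = aps["tf_coeff_abs"][tf_ctb_filter_idx]
    coeff_sign = aps["tf_coeff_sign"][tf_ctb_filter_idx]
    clip_idx = aps["tf_clip_idx"][tf_ctb_filter_idx]
    # tfCoeff[i]: negate the absolute value when the sign element equals 1.
    tf_coeff = [-a if s == 1 else a for a, s in zip(coeff_abs, coeff_sign)]
    # tfClip[i] = 1 << (7 - tf_clip_idx * 2 + (inputBitdepth - 8))
    tf_clip = [1 << (7 - c * 2 + (input_bitdepth - 8)) for c in clip_idx]
    return tf_coeff, tf_clip


# Illustrative TF_APS with index 4 holding two filters of three coefficients each.
aps_store = {4: {"tf_coeff_abs": [[5, 2, 0], [1, 1, 1]],
                 "tf_coeff_sign": [[0, 1, 0], [1, 1, 0]],
                 "tf_clip_idx": [[0, 1, 3], [2, 2, 2]]}}
coeff, clip = select_filter(aps_store, sh_tf_aps_id=[4], tf_set_id=0,
                            tf_ctb_filter_idx=0, input_bitdepth=10)
# coeff == [5, -2, 0] and clip == [512, 128, 8]
```

With a 10-bit input, a cutoff index of 0 gives 1 << 9 = 512 and an index of 3 gives 1 << 3 = 8, so larger indices clip more tightly.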
[0406] The integration of filter shapes provided by embodiments of this application is described below.
[0407] It is understood that in related technologies, unidirectional and bidirectional filters each have two filter shapes, and at most eight unidirectional filters and eight bidirectional filters can be cached as historical filters that serve as candidates for filtering the current image. When reusing historical filters, if the tf_filter_mode parsed from the current image / slice is a unidirectional filtering mode, the filter can only be selected from the cache of historical unidirectional filters; if the tf_filter_mode parsed from the current image / slice is a bidirectional filtering mode, the filter can only be selected from the cache of historical bidirectional filters. That is, in related technologies, because unidirectional and bidirectional filters use different filter shapes, they cannot reuse each other's filters. In this embodiment, due to the limited TF_APS resources (e.g., eight TF_APS), only eight unidirectional and bidirectional filters can be stored in total, and only eight sets of filter-related parameters based on the reference position can be stored in the bitstream. Therefore, in one possible implementation, the filter shape and the number of filter coefficients used in unidirectional and bidirectional filtering can be unified, so that the unidirectional and bidirectional filtering modes can reuse each other's filters. This raises a problem, however: with a unified filter, the bidirectional filtering mode can only use one filter, that is, only one set of input information is allowed. To address this, in one possible implementation, in the bidirectional filtering mode, input information 1 is first determined based on the reference image in one direction and input information 2 is determined based on the reference image in the other direction. The average of input information 1 and input information 2 is then used as input information 3, which is input into the filter.
[0408] In some embodiments, unidirectional and bidirectional filters coexist in the TF_APS-type APS dataset. In related technologies, the unidirectional and bidirectional filtering modes can each store 8 sets of filters; in this embodiment, by contrast, a total of only 8 sets of filters can be cached in the 8 APSs as candidates for the current image. This reduces the number of selectable filters, which may affect the compression performance of encoding and decoding. For this reason, this embodiment further proposes to unify the shapes of the unidirectional and bidirectional filters so that they can be reused. This means that the filtering coefficients of the unidirectional filtering mode can be reused in the bidirectional filtering mode, and the coefficients of the bidirectional filtering mode can also be reused in the unidirectional filtering mode.
[0409] The specific improvement method involves unifying the shapes of unidirectional and bidirectional filters and changing the filtering process of the bidirectional filtering mode. One specific improvement method is as follows:
[0410] First, in the embodiments of this application, multi-shape TF filters (i.e., reference position-based filters) are used. Figure 16 is a schematic diagram of two filter shapes provided by the embodiments of this application that are suitable for both unidirectional and bidirectional filtering. As shown in Figure 16, they are both 13-coefficient, symmetrical filters.
[0411] During decoding, if the selected filtering mode is unidirectional filtering (i.e., tf_filter_mode = 0, 1, 3, 4 or sh_tf_filter_mode = 0, 1, 3, 4), the filtering process of the filter is still the same as the related technology.
[0412] If the selected filtering mode is bidirectional filtering (i.e., tf_filter_mode = 2, 5 or sh_tf_filter_mode = 2, 5), the filtering process should first add the reconstructed values at the corresponding reference positions on the two reference images pairwise and multiply each sum by the corresponding filtering coefficient. The output of the filter should then be further divided by 2.
[0413] This section uses the filter shape shown on the right side of Figure 16 as an example to describe how the improved bidirectional filtering works. (If the filter shape on the left side of Figure 16 is selected, the relative positions of the reference pixels at the filter input should also be improved accordingly.) A comparison of the specific modifications is described below.
[0414] rec0 and rec1 are the reconstructed pixel arrays of the two reconstructed images in the bidirectional filtering process. (x0, y0) and (x1, y1) are the reference positions pointed to by the same position or Mv / Bv in these two reconstructed images, respectively. rec is the output reconstructed value of the SAO filter in the current image. rec' is the value after bidirectional TF filtering.
[0415] From the example of the filtering process given above, we can easily see that, firstly, for the bidirectional filtering mode, each filter coefficient is multiplied by the sum of the reconstructed pixels in the two reference images; secondly, when calculating the output, the result of the summation in the previous step is further divided by 2.
[0416] Comparing the improved bidirectional filtering process described above with the unidirectional filtering process in related techniques shows that, under the same filter shape, the coefficients of the bidirectional filter are equivalent to those of the unidirectional filter. The advantage is that, since unidirectional and bidirectional filtering use the same filter shape, there is no need to distinguish whether the filter coefficients stored in TF_APS are for unidirectional or bidirectional filtering. Filtering can be performed simply by parsing an APS index and a filter mode index from the current image or slice. This improves the accuracy of reconstructed value correction, thereby saving bits in the bitstream.
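The relationship between the two modes can be checked numerically with a small sketch. The exact filtering formula, including how the filter response is combined with the SAO output rec to produce rec', is not reproduced in this text, so the sketch models only the weighted sum over reference samples; the function names, the three-coefficient filter, and the sample values are assumptions made for the example.

```python
import math
from typing import Sequence


def weighted_sum(coeffs: Sequence[float], inputs: Sequence[float]) -> float:
    """Sum of coefficient times input over all filtering positions."""
    return sum(c * x for c, x in zip(coeffs, inputs))


def unidirectional_response(coeffs: Sequence[float], ref: Sequence[float]) -> float:
    """Filter response computed from a single reference image."""
    return weighted_sum(coeffs, ref)


def bidirectional_response(coeffs: Sequence[float],
                           ref0: Sequence[float],
                           ref1: Sequence[float]) -> float:
    """Add paired reference samples, apply the coefficients, then divide by 2."""
    paired = [a + b for a, b in zip(ref0, ref1)]
    return weighted_sum(coeffs, paired) / 2


coeffs = [1, -2, 3]      # illustrative coefficients shared by both modes
ref0 = [10, 20, 30]      # samples around (x0, y0) in rec0
ref1 = [12, 18, 33]      # samples around (x1, y1) in rec1

bi = bidirectional_response(coeffs, ref0, ref1)
# Same result as filtering the average of the two inputs with one filter.
averaged = weighted_sum(coeffs, [(a + b) / 2 for a, b in zip(ref0, ref1)])
assert math.isclose(bi, averaged)
# With identical references the bidirectional response reduces to the unidirectional one.
assert math.isclose(bidirectional_response(coeffs, ref0, ref0),
                    unidirectional_response(coeffs, ref0))
```

Both checks follow from the linearity of the weighted sum, which is why a single stored coefficient set can serve either mode in this simplified model.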
[0417] The following describes skipping certain positions of the TF filter.
[0418] As mentioned earlier, the filtering parameters of TF are moved from the image header and slice header to the APS. Furthermore, by unifying the shapes of the unidirectional and bidirectional filters, the unidirectional and bidirectional filtering modes can reuse each other's filters. This increases the number of combinations and the range of options available to the encoder. Consequently, the improved TF is selected in a higher proportion of cases than in related techniques. This may require the decoder to apply TF filtering on more coding tree units, increasing decoding time. The following methods can be used to mitigate the increase in decoding time caused by the TF technique.
[0419] The key to reducing decoding time lies in reducing the use of the TF technique at filtering positions where its effect is not obvious. For each filtering position in a coding tree unit for which TF filtering is selected, the current block to which that position belongs can be obtained and checked against the following conditions:
[0420] (1) Whether inter-frame prediction is selected for the current block. If it is not, the TF filtering at this position can be skipped, because the correlation between this position and the reference position is not strong and the effect of correction is not obvious.
[0421] (2) Whether IBC prediction is selected for the current block. If it is not, the TF filtering at this position can be skipped, because the correlation between this position and the reference position is not strong and the effect of correction is not obvious.
[0422] (3) Whether the current block contains residual coefficients, i.e., whether the residuals in the current block are all zero. If the residuals of the current block are all zero, the position is extremely similar to the reference position, and the reference position may not provide further information that can be used for filtering, so the TF filtering at this position can be skipped.
[0423] (4) When the prediction mode selected for the current block is inter-frame prediction, whether its motion vector is greater than a preset value. If so, the TF filtering at that position can be skipped, because the reference position may not be strongly correlated and the correction effect is not obvious.
[0424] (5) When the prediction mode of the current block is IBC, how its block vector compares with a preset value. If the block vector is greater than the preset value, the TF filtering at that position can be skipped, because the reference position may not be strongly correlated.
[0425] When one or more of the above conditions are met, even if the coding tree unit to which the current position belongs has selected the TF filtering mode, the filtering at the current position can be skipped, thereby reducing complexity.
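One way to combine these checks is sketched below. The text leaves open which conditions an implementation applies and how they are combined, so everything here is an assumption made for illustration: conditions (1) and (2) are merged into "the block is neither inter-predicted nor IBC-predicted", the vector size is measured as the larger absolute component, and `BlockInfo`, `skip_tf_at_position`, `mv_limit`, and `bv_limit` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BlockInfo:
    pred_mode: str                      # "intra", "inter" or "ibc"
    all_zero_residual: bool             # True when the block has no residual coefficients
    vector: Tuple[int, int] = (0, 0)    # MV for inter blocks, BV for IBC blocks


def skip_tf_at_position(block: BlockInfo, mv_limit: int, bv_limit: int) -> bool:
    """Return True when TF filtering can be skipped for a position in this block."""
    # Conditions (1) and (2), combined: no usable reference position.
    if block.pred_mode not in ("inter", "ibc"):
        return True
    # Condition (3): all-zero residual, the position already matches its reference.
    if block.all_zero_residual:
        return True
    # Assumed magnitude measure: the larger absolute vector component.
    magnitude = max(abs(block.vector[0]), abs(block.vector[1]))
    # Condition (4): large motion vector on an inter-predicted block.
    if block.pred_mode == "inter" and magnitude > mv_limit:
        return True
    # Condition (5): large block vector on an IBC-predicted block.
    if block.pred_mode == "ibc" and magnitude > bv_limit:
        return True
    return False


keep = skip_tf_at_position(BlockInfo("inter", False, (3, -2)), mv_limit=16, bv_limit=64)
far = skip_tf_at_position(BlockInfo("inter", False, (40, 0)), mv_limit=16, bv_limit=64)
```

The check runs per filtering position, so it only reduces decoding time if obtaining the block information is cheaper than the filtering it avoids.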
[0426] As can be understood from the above scheme, in this embodiment, the filtering-related parameters of TF are placed in the APS (adaptive parameter set) rather than in image-level or slice-level caches. This is because, whether the TF adaptive filtering parameters are encoded / decoded at the image level or at the slice level, corresponding caches are needed to store the parsed coefficients. However, the inventors of this application have found through research and analysis that these filtering-related parameters are not actually updated and parsed that frequently and do not need to be updated at the image level or slice level. Therefore, placing them in the APS reduces the cache size at the slice level and image level. In addition, since the adaptive parameters of other tools in VVC and ECM are also parsed and stored in the APS, placing the filtering-related parameters of TF in the APS also makes the design more consistent with other tools.
[0427] Secondly, in this embodiment, the filter shapes of the unidirectional filtering mode and the bidirectional filtering mode are further unified, so that the filter-related parameters in TF_APS can be reused in both unidirectional and bidirectional filtering modes. Thus, simply by combining the filtering mode parsed for the current image / slice with the selected filter-related parameters in TF_APS, more diverse reuse combinations become available.
[0428] It is understood that, in the embodiments of this application,
[0429] (1) The filter shapes of the one-way and two-way filtering modes in the adaptive filtering technology based on reference images are unified, and the filter coefficients are also transmitted in the bit stream in the form of APS.
[0430] (2) Based on (1), when the filter coefficients are encoded and decoded in the APS, they need to be distinguished by an additional APS category. In this embodiment, the adaptive parameters of TF will be further decoded when the APS type is TF_APS.
[0431] (3) Based on (2), since TF filtering can be used on the luminance component, the parsing of chroma_present_flag can be skipped when the type of APS is TF_APS.
[0432] (4) Based on (1), after the TF enable flag is encoded and decoded in the image header and the slice header, it is necessary to further encode and decode which one or more TF_APS-type APSs contain the corresponding filter-related parameters.
[0433] (5) Based on (1), in the syntax elements at the coding tree unit level, it is also necessary to determine which TF_APS type APS filter-related parameters are selected for filtering by parsing tf_setId.
[0434] Based on the same inventive concept as the foregoing embodiments, this application provides an encoder; Figure 17 is a schematic diagram of the composition structure of the encoder provided in this application. As shown in Figure 17, the encoder 170 may include a first determining unit 1701 and a first filtering unit 1702, wherein:
[0435] The first determining unit 1701 is configured to determine the filter-related parameters of the current block, wherein the filter-related parameters are those of a filter based on the reference position; wherein the reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-position of the current position on the reference image.
[0436] The first determining unit 1701 is configured to determine the filter coefficients of the filter in the current block based on the filter-related parameters of the current block.
[0437] The first filtering unit 1702 is configured to filter the reconstructed value of the current position based on the reference position of the current position and the filtering coefficients of the filter of the current block.
[0438] In some embodiments, encoder 170 further includes encoding unit 1703; wherein, first determining unit 1701 is further configured to determine the values of first syntax element and second syntax element; wherein, the first syntax element is used to indicate the index of the APS used for filtering the current block in the first APS candidate list; the second syntax element is used to indicate the index of the filtering related parameters of the current block in the APS indicated by the first syntax element; encoding unit 1703 is configured to encode the values of the first syntax element and the second syntax element and write the obtained encoded bits into the bitstream.
[0439] In some embodiments, the first determining unit 1701 is configured to: determine the number of APSs used in the current slice or the current image; and if the number of APSs used in the current slice or the current image is greater than 1, determine the values of the first syntax element and the second syntax element; the APS includes filtering-related parameters of one or more filters.
[0440] In some embodiments, encoder 170 further includes encoding unit 1703; wherein, the first determining unit 1701 is further configured to determine the value of the third syntax element based on the number of APS used by the current slice or the current image; wherein, the third syntax element is used to indicate the number of APS used by the current slice or the current image; the encoding unit 1703 is configured to encode the value of the third syntax element and write the obtained encoded bits into the bitstream.
[0441] In some embodiments, the first determining unit 1701 is configured to determine the values of the first syntax element and the second syntax element when it is determined that the current block uses a reference position-based filtering technique and the number of APS used in the current slice or the current image is greater than 1.
[0442] In some embodiments, the encoder 170 further includes an encoding unit 1703; wherein the first determining unit 1701 is further configured to determine the value of the fourth syntax element based on the current block using a reference position-based filtering technique; and the encoding unit 1703 is configured to encode the value of the fourth syntax element and write the obtained encoded bits into the bitstream.
[0443] In some embodiments, the first determining unit 1701 is further configured to determine whether the current block uses a reference position-based filtering technique if it is determined that the current slice or the current image is filtered using a filter based on the reference position.
[0444] In some embodiments, the encoder 170 further includes an encoding unit 1703; wherein the first determining unit 1701 is further configured to determine the value of the fifth syntax element based on the current slice or the current image being filtered using a filter based on the reference position; the encoding unit 1703 is configured to encode the value of the fifth syntax element and write the obtained encoded bits into the bitstream.
[0445] In some embodiments, the first determining unit 1701 is further configured to: determine the index of at least one APS used by the current slice or the current image; and construct a first APS candidate list based on the index of at least one APS used by the current slice or the current image and the corresponding APS.
[0446] In some embodiments, the encoder 170 further includes an encoding unit 1703; wherein the first determining unit 1701 is further configured to determine the value of the seventh syntax element based on the index of at least one APS used by the current slice or the current image; the encoding unit 1703 is configured to encode the value of the seventh syntax element and write the obtained encoded bits into the bitstream.
[0447] In some embodiments, the control identifier of the filtering technique based on reference position includes a third syntax element and a seventh syntax element; the encoder 170 further includes an encoding unit 1703; wherein, the first determining unit 1701 is further configured to, when the control identifier exists in the image header of the bitstream, take the value of the eighth syntax element as a first value, and the encoding unit 1703 is configured to encode the value of the eighth syntax element and write the obtained encoded bits into the bitstream; or, the first determining unit 1701 is further configured to, when the control identifier exists in the slice header of the bitstream, take the value of the eighth syntax element as a second value, and the encoding unit 1703 is configured to encode the value of the eighth syntax element and write the obtained encoded bits into the bitstream.
[0448] In some embodiments, the control identifier for the reference position-based filtering technique includes a third syntax element and a seventh syntax element; if the control identifier exists in the slice header of the bitstream, the location of the control identifier is not indicated in the bitstream.
[0449] In some embodiments, the encoder 170 further includes an encoding unit 1703; wherein the first determining unit 1701 is further configured to determine the values of the ninth syntax element and the tenth syntax element; wherein the ninth syntax element is used to indicate the index of the APS currently carried in the bitstream; the tenth syntax element is used to indicate the type of the APS currently carried in the bitstream; the encoding unit 1703 is configured to encode the values of the ninth syntax element and the tenth syntax element and write the obtained encoded bits into the bitstream.
[0450] In some embodiments, encoder 170 further includes encoding unit 1703; wherein the first determining unit 1701 is further configured to determine one or more of the following syntax elements based on the filter correlation parameters of the filter based on the reference position in the APS corresponding to the value of the ninth syntax element:
[0451] The eleventh syntax element; the eleventh syntax element is used to indicate the number of filters corresponding to the filter-related parameters in the current APS;
[0452] The twelfth syntax element; the twelfth syntax element is used to indicate the absolute value of the filter coefficients in the current APS;
[0453] The thirteenth syntax element; the thirteenth syntax element is used to indicate the sign of the filter coefficients in the current APS;
[0454] The fourteenth syntax element; the fourteenth syntax element is used to indicate the precision of the filter coefficients in the current APS;
[0455] The fifteenth syntax element; the fifteenth syntax element is used to indicate the encoding method of the filter coefficients in the current APS;
[0456] The sixteenth syntax element; the sixteenth syntax element is used to indicate whether the filter in the current APS uses nonlinear truncation;
[0457] The seventeenth syntax element; the seventeenth syntax element is used to indicate the index of the nonlinear truncation used by the filter in the current APS;
[0458] The encoding unit 1703 is configured to encode the values of one or more syntax elements from the eleventh to the seventeenth syntax elements and write the resulting encoded bits into the bit stream.
[0459] In some embodiments, the filters corresponding to the reference position in the APS indicated in the bitstream have the same shape, and the filter-related parameters of the reference position-based filters in each APS are used for unidirectional filtering and bidirectional filtering.
[0460] In some embodiments, the first filtering unit 1702 is configured to: determine the filtering mode of the current position; and filter the reconstructed value of the current position according to the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block.
[0461] Furthermore, in some embodiments, the filtering mode of the current position is a first mode, which is a bidirectional filtering mode based on inter-frame prediction; the reference position of the current position includes a first reference position and a second reference position; wherein, the first reference position is the position determined by the first MV of the current position on the forward inter-frame reference image, and the second reference position is the position determined by the second MV of the current position on the backward inter-frame reference image.
[0462] Furthermore, in some embodiments, the first filtering unit 1702 is configured as follows:
[0463] When the filtering mode of the current position is the first mode, obtain the first input information of the filter of the current block based on the first reference position, the forward inter-frame reference image, and the shape of the filter of the current block;
[0464] Obtain the second input information of the filter of the current block based on the second reference position, the backward inter-frame reference image, and the shape of the filter of the current block; wherein the first input information and the second input information correspond to the same filter;
[0465] Determine the third input information based on the first input information and the second input information;
[0466] Filter the reconstructed value of the current position based on the third input information and the filter coefficients of the filter of the current block.
[0467] Furthermore, in some other embodiments, the filtering mode of the current position is a second mode, which is a bidirectional filtering mode based on the co-position; the reference position of the current position includes a third reference position and a fourth reference position; wherein, the third reference position is the co-position of the current position on the nearest forward reference image of the current image, and the fourth reference position is the co-position of the current position on the nearest backward reference image of the current image.
[0468] Furthermore, in some other embodiments, the first filtering unit 1702 is configured as follows:
[0469] When the filtering mode of the current position is the second mode, obtain the fourth input information of the filter of the current block based on the third reference position, the nearest forward reference image, and the shape of the filter of the current block;
[0470] Obtain the fifth input information of the filter of the current block based on the fourth reference position, the nearest backward reference image, and the shape of the filter of the current block; wherein the fourth input information and the fifth input information correspond to the same filter;
[0471] Determine the sixth input information based on the fourth input information and the fifth input information;
[0472] Filter the reconstructed value of the current position based on the sixth input information and the filter coefficients of the filter of the current block.
[0473] Furthermore, in some other embodiments, the first filtering unit 1702 is configured as follows:
[0474] When the filtering mode of the current position is any one of the third to sixth modes, obtain the seventh input information of the filter of the current block based on the reference position of the current position, the image in which the reference position is located, and the shape of the filter of the current block; wherein the third to sixth modes are all unidirectional filtering modes;
[0475] Filter the reconstructed value of the current position based on the seventh input information and the filter coefficients of the filter of the current block.
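For ease of understanding, the bidirectional and unidirectional filtering flows described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of this application: all function names are hypothetical, the two inputs of a bidirectional mode are assumed to be merged by a rounded average, out-of-picture taps are assumed to be clamped, and the filter is assumed to add a weighted sum of differences to the reconstructed value. The same coefficients are applied in both flows, consistent with one filter serving both input sets.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]      # image[y][x]
Offset = Tuple[int, int]     # (dx, dy) tap offset within the filter shape


def gather(image: Image, ref_pos: Tuple[int, int], shape: Sequence[Offset]) -> List[int]:
    """Collect the samples covered by the filter shape centred on ref_pos."""
    h, w = len(image), len(image[0])
    x0, y0 = ref_pos
    out = []
    for dx, dy in shape:
        x = min(max(x0 + dx, 0), w - 1)  # clamp to picture bounds (assumption)
        y = min(max(y0 + dy, 0), h - 1)
        out.append(image[y][x])
    return out


def combine(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Merge two input sets into one; a rounded average is assumed."""
    return [(p + q + 1) >> 1 for p, q in zip(a, b)]


def filter_sample(recon: int, inputs: Sequence[int], coeffs: Sequence[float]) -> float:
    """Assumed filter form: recon plus weighted (input - recon) differences."""
    return recon + sum(c * (v - recon) for c, v in zip(coeffs, inputs))


def filter_bidirectional(recon, pos_a, img_a, pos_b, img_b, shape, coeffs):
    first = gather(img_a, pos_a, shape)    # e.g. first or fourth input information
    second = gather(img_b, pos_b, shape)   # e.g. second or fifth input information
    merged = combine(first, second)        # e.g. third or sixth input information
    return filter_sample(recon, merged, coeffs)


def filter_unidirectional(recon, ref_pos, img, shape, coeffs):
    seventh = gather(img, ref_pos, shape)  # seventh input information
    return filter_sample(recon, seventh, coeffs)


shape = [(0, 0), (-1, 0), (1, 0)]          # toy three-tap horizontal shape
fwd = [[10, 20, 30]]
bwd = [[14, 24, 34]]
coeffs = [0.5, 0.25, 0.25]
bi = filter_bidirectional(20, (1, 0), fwd, (1, 0), bwd, shape, coeffs)
uni = filter_unidirectional(20, (1, 0), fwd, shape, coeffs)
print(bi, uni)  # 22.0 20.0
```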
[0476] Furthermore, in some embodiments, the encoder 170 further includes an encoding unit 1703; wherein the first determining unit 1701 is further configured to determine the filtering mode of the current image or current slice when it is determined that the current image or current slice uses a filtering technique based on reference position; and to determine the value of the eighteenth syntax element according to the filtering mode of the current image or current slice; the encoding unit 1703 is configured to encode the value of the eighteenth syntax element and write the obtained encoded bits into the bitstream.
[0477] In some embodiments, if the current block satisfies a first condition, filtering of the reconstructed value at the current position is skipped;
[0478] The first condition includes one or more of the following conditions:
[0479] Inter-frame prediction mode is not selected for the current block;
[0480] The IBC prediction mode is not selected for the current block;
[0481] The residual samples of the current block all have a value of 0;
[0482] The prediction mode of the current block is inter-frame prediction mode, and the horizontal displacement component of the motion vector used is greater than or equal to the first threshold and/or the vertical displacement component of the motion vector used is greater than or equal to the second threshold;
[0483] The prediction mode of the current block is IBC prediction mode, and the horizontal displacement component of the block vector used is greater than or equal to the third threshold and/or the vertical displacement component of the block vector used is greater than or equal to the fourth threshold.
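The first condition listed above can be illustrated with the following sketch. It assumes that an embodiment enables a subset of the listed conditions and skips filtering when any enabled condition holds. The names `BlockInfo` and `skip_filtering`, the mode labels, the default threshold values, and the use of the absolute value of each displacement component are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass
class BlockInfo:
    pred_mode: str                         # "inter", "ibc", or "intra" (hypothetical labels)
    residual_all_zero: bool
    mv: Optional[Tuple[int, int]] = None   # motion vector (horizontal, vertical)
    bv: Optional[Tuple[int, int]] = None   # block vector (horizontal, vertical)


def exceeds(vec: Tuple[int, int], thr_h: int, thr_v: int) -> bool:
    """True if either component magnitude reaches its threshold (abs() is assumed)."""
    return abs(vec[0]) >= thr_h or abs(vec[1]) >= thr_v


def skip_filtering(blk: BlockInfo, enabled: Set[str], thr=(64, 64, 64, 64)) -> bool:
    """Return True when any enabled sub-condition of the first condition holds."""
    checks = {
        "not_inter": blk.pred_mode != "inter",
        "not_ibc": blk.pred_mode != "ibc",
        "zero_residual": blk.residual_all_zero,
        "large_mv": blk.pred_mode == "inter" and blk.mv is not None
                    and exceeds(blk.mv, thr[0], thr[1]),
        "large_bv": blk.pred_mode == "ibc" and blk.bv is not None
                    and exceeds(blk.bv, thr[2], thr[3]),
    }
    return any(checks[name] for name in enabled)


enabled = {"zero_residual", "large_mv"}
print(skip_filtering(BlockInfo("inter", False, mv=(70, 0)), enabled))   # True
print(skip_filtering(BlockInfo("inter", False, mv=(3, -2)), enabled))   # False
```

Which sub-conditions are enabled is left to the embodiment; enabling both "not_inter" and "not_ibc" in this sketch would skip every block, so a practical configuration would normally choose at most one of them.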
[0484] The description of the encoder embodiments above is similar to the description of the encoding / decoding method embodiments above, and has similar beneficial effects. For technical details not disclosed in the encoder embodiments of this application, please refer to the description of the encoding / decoding method embodiments of this application for understanding.
[0485] Figure 18 is a schematic diagram of the hardware structure of the encoder provided in an embodiment of this application. As shown in Figure 18, the encoder 180 may include: a first communication interface 1801, a first memory 1802, and a first processor 1803; the various components are coupled together through a first bus system 1804. It can be understood that the first bus system 1804 is used to realize the connection and communication between these components. In addition to a data bus, the first bus system 1804 also includes a power bus, a control bus, and a status signal bus. However, for clarity, all buses are labeled as the first bus system 1804 in Figure 18.
[0486] The first communication interface 1801 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
[0487] The first memory 1802 is used to store computer programs that can run on the first processor 1803;
[0488] The first processor 1803, when running the computer program, performs the following actions: determining the filtering correlation parameters of the current block, wherein the filtering correlation parameters are filtering correlation parameters of a filter based on a reference position; wherein the reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-position of the current position on the reference image; determining the filtering coefficients of the filter for the current block based on the filtering correlation parameters of the current block; and filtering the reconstructed value of the current position based on the reference position of the current position and the filtering coefficients of the filter for the current block.
[0489] It is understood that the first memory 1802 in the embodiments of this application can be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory can be random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced Synchronous DRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The first memory 1802 of the system and method described in this application is intended to include, but is not limited to, these and any other suitable types of memory.
[0490] The first processor 1803 may be an integrated circuit chip with signal processing capabilities. In implementation, each step of the above method can be completed by the integrated logic circuitry in the hardware of the first processor 1803 or by instructions in software form. The first processor 1803 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application can be directly embodied in the execution of a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software modules may reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. The storage medium is located in the first memory 1802. The first processor 1803 reads the information in the first memory 1802 and completes the steps of the above method in conjunction with its hardware.
[0491] It is understood that the embodiments described in this application can be implemented using hardware, software, firmware, middleware, microcode, or a combination thereof. For hardware implementation, the processing unit can be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or combinations thereof. For software implementation, the technology described in this application can be implemented through modules (e.g., procedures, functions, etc.) that perform the functions described in this application. Software code can be stored in memory and executed by a processor. The memory can be implemented in the processor or external to the processor.
[0492] Alternatively, as another embodiment, the first processor 1803 is also configured to perform the method described in any of the foregoing embodiments when running the computer program.
[0493] Based on the same inventive concept as the foregoing embodiments, Figure 19 is a schematic diagram of the composition structure of the decoder provided in an embodiment of this application. As shown in Figure 19, the decoder 190 may include a first acquisition unit 1901, a second determining unit 1902, and a second filtering unit 1903, wherein:
[0494] The first acquisition unit 1901 is configured to acquire the filter correlation parameters of the current block from the APS. The filter correlation parameters are the filter correlation parameters of the filter based on the reference position. The reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-position of the current position on the reference image.
[0495] The second determining unit 1902 is configured to determine the filter coefficients of the filter in the current block based on the filter-related parameters of the current block.
[0496] The second filtering unit 1903 is configured to filter the reconstructed value of the current position based on the reference position of the current position and the filtering coefficients of the filter of the current block.
[0497] In some embodiments, the decoder 190 further includes a decoding unit 1904; wherein the decoding unit 1904 is configured to: parse a first syntax element and a second syntax element in the bitstream; wherein the first syntax element is used to indicate the index of the APS used for filtering the current block in the first APS candidate list; the second syntax element is used to indicate the index of the filtering-related parameters of the current block in the APS indicated by the first syntax element; and the first acquisition unit 1901 is configured to acquire the filtering-related parameters of the current block from the first APS candidate list according to the values of the first syntax element and the second syntax element.
[0498] Furthermore, in some embodiments, the decoding unit 1904 is configured to: parse a third syntax element in the bitstream; wherein the third syntax element is used to indicate the number of APSs used in the current slice or the current image; the APS includes filtering correlation parameters of one or more filters; and parse a first syntax element and a second syntax element in the bitstream when the value of the third syntax element is greater than 0.
[0499] Furthermore, in some embodiments, the decoding unit 1904 is configured to: parse the fourth syntax element in the bitstream; and parse the first and second syntax elements in the bitstream when the value of the fourth syntax element indicates that the current block uses a reference position-based filtering technique and the value of the third syntax element is greater than 0.
[0500] Furthermore, in some embodiments, the decoding unit 1904 is configured to: parse a fifth syntax element in the bitstream; and parse a fourth syntax element in the bitstream when the value of the fifth syntax element indicates that the current slice or current image uses a reference position-based filtering technique.
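The conditional parsing order described in the preceding paragraphs (fifth syntax element, then fourth, then first and second only when the third is greater than 0) can be sketched as follows. The function name and the `read` callback, which stands in for the entropy decoder and returns the next decoded value of a named syntax element, are hypothetical.

```python
from typing import Callable, Optional, Tuple


def parse_block_aps_indices(
    read: Callable[[str], int],
    slice_uses_filter: bool,   # value of the fifth syntax element
    num_aps: int,              # value of the third syntax element
) -> Optional[Tuple[int, int]]:
    """Return (first, second) syntax element values, or None if they are not coded."""
    if not slice_uses_filter:
        return None                      # the fourth syntax element is not parsed
    block_uses_filter = read("fourth")   # block-level on/off flag
    if not block_uses_filter or num_aps <= 0:
        return None
    aps_idx = read("first")              # index into the first APS candidate list
    param_idx = read("second")           # index of the parameters inside that APS
    return aps_idx, param_idx


# Toy reader that replays already-decoded values in order and logs what was read.
values = iter([1, 2, 0])
log = []


def read(name: str) -> int:
    log.append(name)
    return next(values)


print(parse_block_aps_indices(read, True, 3))  # (2, 0)
print(log)                                     # ['fourth', 'first', 'second']
```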
[0501] In some embodiments, the decoder 190 further includes a construction unit 1905; wherein the decoding unit 1904 is configured to: parse a seventh syntax element in the bitstream according to a third syntax element; wherein the seventh syntax element is used to indicate the index of the APS used by the current slice or the current image; the construction unit 1905 is configured to: obtain the corresponding APS according to the value of at least one parsed seventh syntax element; and construct a first APS candidate list according to the value of at least one parsed seventh syntax element and the corresponding APS.
[0502] In some embodiments, the decoding unit 1904 is configured to: parse an eighth syntax element in the bitstream; wherein the eighth syntax element is used to indicate the location of a control identifier for a filtering technique based on a reference position; if the value of the eighth syntax element indicates that the control identifier exists in the image header, parse a third syntax element carried in the image header of the bitstream, and parse a seventh syntax element carried in the image header of the bitstream according to the third syntax element; or, if the value of the eighth syntax element indicates that the control identifier exists in the slice header, parse a third syntax element carried in the slice header of the bitstream, and parse a seventh syntax element carried in the slice header of the bitstream according to the third syntax element; or, if the eighth syntax element is not obtained when parsing the bitstream, parse a third syntax element carried in the slice header of the bitstream, and parse a seventh syntax element carried in the slice header of the bitstream according to the third syntax element.
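The three cases above reduce to a simple selection of the header from which the third and seventh syntax elements are parsed. In the sketch below, the function name and the convention that a value of 1 denotes the image (picture) header are assumptions; `None` models the case in which the eighth syntax element is not obtained from the bitstream.

```python
from typing import Optional


def header_for_aps_list(eighth: Optional[int]) -> str:
    """Pick the header that carries the third and seventh syntax elements.

    The value 1 is assumed to mean that the control identifier is in the picture header.
    """
    if eighth is None:
        return "slice_header"        # element absent: fall back to the slice header
    return "picture_header" if eighth == 1 else "slice_header"


print(header_for_aps_list(1))     # picture_header
print(header_for_aps_list(0))     # slice_header
print(header_for_aps_list(None))  # slice_header
```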
[0503] In some embodiments, the decoding unit 1904 is configured to: parse the ninth syntax element and the tenth syntax element in the bitstream; wherein the ninth syntax element is used to indicate the index of the APS currently carried in the bitstream; the tenth syntax element is used to indicate the type of the APS currently carried in the bitstream; when the value of the tenth syntax element indicates that the type of the APS is a reference position-based filtering type, the filtering-related parameters of the reference position-based filter currently carried in the bitstream are parsed; and according to the value of the ninth syntax element, the filtering-related parameters of the corresponding parsed reference position-based filter are saved to the APS at the corresponding index.
[0504] In some embodiments, the construction unit 1905 is configured to: obtain the corresponding APS from the pre-stored APS according to the value of the parsed seventh syntax element.
[0505] In some embodiments, the filters corresponding to the pre-stored APS have the same shape, and the filter correlation parameters of the filter based on the reference position in each APS are used for unidirectional filtering and bidirectional filtering.
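As a non-limiting illustration, the storage of a received APS by index, the construction of the first APS candidate list from the parsed seventh syntax elements, and the block-level selection by the first and second syntax elements can be sketched as follows. The type code `REF_POS_FILTER_TYPE`, the dictionary used as the APS store, and all function names are hypothetical.

```python
from typing import Dict, List

REF_POS_FILTER_TYPE = 3   # hypothetical value of the tenth syntax element

FilterParams = List[float]
aps_store: Dict[int, List[FilterParams]] = {}   # pre-stored APSs keyed by APS index


def on_aps_received(aps_index: int, aps_type: int, filters: List[FilterParams]) -> None:
    """Save the parsed parameters at the index given by the ninth syntax element."""
    if aps_type == REF_POS_FILTER_TYPE:
        aps_store[aps_index] = filters          # replaces any APS already at this index


def build_candidate_list(seventh_values: List[int]) -> List[List[FilterParams]]:
    """Build the first APS candidate list for the current slice or image."""
    return [aps_store[i] for i in seventh_values]


def select_parameters(candidates, first: int, second: int) -> FilterParams:
    """Pick one filter's parameters using the first and second syntax elements."""
    return candidates[first][second]


on_aps_received(5, REF_POS_FILTER_TYPE, [[0.5, 0.25]])
on_aps_received(2, REF_POS_FILTER_TYPE, [[0.1], [0.2]])
on_aps_received(7, 0, [[9.9]])                  # different APS type: not stored here

candidates = build_candidate_list([2, 5])
print(select_parameters(candidates, 0, 1))      # [0.2]
```

Because the parameters stay in the store until an APS with the same index arrives, slices and images only carry indices, which is consistent with the cache saving described in this application.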
[0506] In some embodiments, the decoding unit 1904 is configured to: parse one or more of the following syntax elements currently carried in the bitstream, and determine the filter-related parameters of the currently carried reference position-based filter:
[0507] The eleventh syntax element; the eleventh syntax element is used to indicate the number of filters corresponding to the filter-related parameters in the current APS;
[0508] The twelfth syntax element; the twelfth syntax element is used to indicate the absolute value of the filter coefficients in the current APS;
[0509] The thirteenth syntax element; the thirteenth syntax element is used to indicate the sign of the filter coefficients in the current APS;
[0510] The fourteenth syntax element; the fourteenth syntax element is used to indicate the precision of the filter coefficients in the current APS;
[0511] The fifteenth syntax element; the fifteenth syntax element is used to indicate the encoding method of the filter coefficients in the current APS;
[0512] The sixteenth syntax element; the sixteenth syntax element is used to indicate whether the filter in the current APS uses nonlinear truncation;
[0513] The seventeenth syntax element; the seventeenth syntax element is used to indicate the index of the nonlinear truncation used by the filter in the current APS.
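For illustration, the values of the eleventh to seventeenth syntax elements could be held on the decoder side in a record such as the one below, from which signed coefficients are rebuilt. The class name `RefPosFilterAps`, the field layout, the sign convention (1 means negative), and the reading of precision as a count of fractional bits are assumptions for this sketch only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RefPosFilterAps:
    """Hypothetical container for the APS-level syntax elements 11 to 17."""
    num_filters: int              # eleventh syntax element
    coeff_abs: List[List[int]]    # twelfth: absolute values, one list per filter
    coeff_sign: List[List[int]]   # thirteenth: 1 means negative (assumed convention)
    coeff_precision: int          # fourteenth: assumed number of fractional bits
    coeff_coding_method: int      # fifteenth: kept as an opaque identifier
    clip_enabled: bool            # sixteenth: nonlinear truncation on or off
    clip_idx: List[List[int]]     # seventeenth: truncation index per coefficient

    def coefficients(self, filter_idx: int) -> List[float]:
        """Rebuild signed coefficients for one filter (assumed fixed-point layout)."""
        scale = 1 << self.coeff_precision
        return [
            (-a if s else a) / scale
            for a, s in zip(self.coeff_abs[filter_idx], self.coeff_sign[filter_idx])
        ]


aps = RefPosFilterAps(
    num_filters=1,
    coeff_abs=[[32, 16, 8]],
    coeff_sign=[[0, 1, 0]],
    coeff_precision=7,
    coeff_coding_method=0,
    clip_enabled=False,
    clip_idx=[[0, 0, 0]],
)
print(aps.coefficients(0))  # [0.25, -0.125, 0.0625]
```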
[0514] In some embodiments, the second filtering unit 1903 is configured to: determine the filtering mode of the current position; and filter the reconstructed value of the current position according to the reference position of the current position, the filtering mode of the current position, and the filtering coefficients of the filter of the current block.
[0515] Furthermore, in some embodiments, the filtering mode of the current position is a first mode, which is a bidirectional filtering mode based on inter-frame prediction; the reference position of the current position includes a first reference position and a second reference position; wherein, the first reference position is the position determined by the first MV of the current position on the forward inter-frame reference image, and the second reference position is the position determined by the second MV of the current position on the backward inter-frame reference image.
[0516] Furthermore, in some embodiments, the second filtering unit 1903 is configured as follows:
[0517] When the filtering mode of the current position is the first mode, obtain the first input information of the filter of the current block based on the first reference position, the forward inter-frame reference image, and the shape of the filter of the current block; obtain the second input information of the filter of the current block based on the second reference position, the backward inter-frame reference image, and the shape of the filter of the current block; wherein the first input information and the second input information correspond to the same filter;
[0518] Determine the third input information based on the first input information and the second input information;
[0519] Filter the reconstructed value of the current position based on the third input information and the filter coefficients of the filter of the current block.
[0520] Furthermore, in some other embodiments, the filtering mode of the current position is a second mode, which is a bidirectional filtering mode based on the co-position; the reference position of the current position includes a third reference position and a fourth reference position; wherein, the third reference position is the co-position of the current position on the nearest forward reference image of the current image, and the fourth reference position is the co-position of the current position on the nearest backward reference image of the current image.
[0521] Furthermore, in some other embodiments, the second filtering unit 1903 is configured as follows:
[0522] When the filtering mode of the current position is the second mode, obtain the fourth input information of the filter of the current block based on the third reference position, the nearest forward reference image, and the shape of the filter of the current block; obtain the fifth input information of the filter of the current block based on the fourth reference position, the nearest backward reference image, and the shape of the filter of the current block; wherein the fourth input information and the fifth input information correspond to the same filter;
[0523] Determine the sixth input information based on the fourth input information and the fifth input information;
[0524] Filter the reconstructed value of the current position based on the sixth input information and the filter coefficients of the filter of the current block.
[0525] Furthermore, in some other embodiments, the second filtering unit 1903 is configured to: when the filtering mode at the current position is any one of the third to sixth modes, obtain the seventh input information of the filter of the current block based on the reference position at the current position, the image where the reference position is located, and the shape of the filter of the current block; wherein the third to sixth modes are all unidirectional filtering modes; and filter the reconstructed value at the current position based on the seventh input information and the filtering coefficient of the filter of the current block.
[0526] In some embodiments, the decoding unit 1904 is configured to: parse the eighteenth syntax element in the bitstream when it is determined that the current image or current slice uses a reference position-based filtering technique; the second filtering unit 1903 is configured to determine the filtering mode of the current position based on the value of the eighteenth syntax element.
[0527] In some embodiments, if the current block satisfies a first condition, filtering of the reconstructed value at the current position is skipped;
[0528] The first condition includes one or more of the following conditions:
[0529] Inter-frame prediction mode is not selected for the current block;
[0530] The IBC prediction mode is not selected for the current block;
[0531] The residual samples of the current block all have a value of 0;
[0532] The prediction mode of the current block is inter-frame prediction mode, and the horizontal displacement component of the motion vector used is greater than or equal to the first threshold and/or the vertical displacement component of the motion vector used is greater than or equal to the second threshold;
[0533] The prediction mode of the current block is IBC prediction mode, and the horizontal displacement component of the block vector used is greater than or equal to the third threshold and/or the vertical displacement component of the block vector used is greater than or equal to the fourth threshold.
[0534] The description of the decoder embodiments above is similar to the description of the encoding / decoding method embodiments above, and has similar beneficial effects. For technical details not disclosed in the decoder embodiments of this application, please refer to the description of the encoding / decoding method embodiments of this application for understanding.
[0535] Understandably, in this embodiment, a "unit" can be a portion of a circuit, a portion of a processor, a portion of a program or software, etc., and can also be a module or a non-modular component. Furthermore, the components in this embodiment can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional module.
[0536] Figure 20 is a schematic diagram of the hardware structure of the decoder provided in an embodiment of this application. As shown in Figure 20, the decoder 200 may include: a second communication interface 2001, a second memory 2002, and a second processor 2003; the various components are coupled together through a second bus system 2004. It is understood that the second bus system 2004 is used to realize the connection and communication between these components. In addition to a data bus, the second bus system 2004 also includes a power bus, a control bus, and a status signal bus. However, for clarity, all buses are labeled as the second bus system 2004 in Figure 20.
[0537] The second communication interface 2001 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
[0538] The second memory 2002 is used to store computer programs that can run on the second processor 2003;
[0539] The second processor 2003 is configured to, when running the computer program, perform the following:
[0540] Obtaining the filter correlation parameters of the current block from the APS, wherein the filter correlation parameters are filter correlation parameters of a filter based on a reference position, and the reference position includes the position pointed to by the BV of the current position, the position pointed to by the MV of the current position, or the co-position of the current position on the reference image;
[0541] Determining the filter coefficients of the filter of the current block based on the filter correlation parameters of the current block;
[0542] Filtering the reconstructed value of the current position based on the reference position of the current position and the filter coefficients of the filter of the current block.
[0543] Alternatively, as another embodiment, the second processor 2003 is also configured to perform the method described in any of the foregoing embodiments when running the computer program.
[0544] It is understood that the second memory 2002 has similar hardware functions to the first memory 1802, and the second processor 2003 has similar hardware functions to the first processor 1803; these will not be described in detail here.
[0545] Figure 21 is a schematic diagram of the composition structure of an encoding and decoding system provided in an embodiment of this application. As shown in Figure 21, the encoding and decoding system 210 may include an encoder 2101 and a decoder 2102.
[0546] In this embodiment, encoder 2101 can be any of the encoders described in the foregoing embodiments, and decoder 2102 can be any of the decoders described in the foregoing embodiments.
[0547] In some embodiments, this application also provides a computer-readable storage medium storing a computer program thereon. When executed by a processor, the computer program implements the method as described in any of the foregoing embodiments. Specifically, when executed by a first processor, the computer program implements the encoding method as described in any of the foregoing embodiments, or when executed by a second processor, it implements the decoding method as described in any of the foregoing embodiments.
[0548] In some embodiments, this application also provides a computer program product, including a computer program or instructions. When executed by a processor, the computer program or instructions implement the method as described in any of the foregoing embodiments. Specifically, when executed by a first processor, the computer program or instructions implement the encoding method as described in any of the foregoing embodiments, or when executed by a second processor, they implement the decoding method as described in any of the foregoing embodiments.
[0549] In some embodiments, this application also provides a computer program that, when executed by a processor, implements the method as described in any of the foregoing embodiments. Specifically, when executed by a first processor, the computer program implements the encoding method as described in any of the foregoing embodiments, or when executed by a second processor, it implements the decoding method as described in any of the foregoing embodiments.
[0550] In some embodiments, this application also provides a computer-readable storage medium storing a bitstream thereon. The bitstream is generated by performing the steps of the encoding method as described in any of the foregoing embodiments.
[0551] In this embodiment of the application, the information to be encoded in the encoding method includes one or more syntax elements mentioned above. Here, this information to be encoded is encoded and written into the bitstream.
[0552] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed in this application can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0553] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the above-described apparatuses and units may be understood by reference to the corresponding processes in the foregoing method embodiments, and are not repeated here.
[0554] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For instance, the division of units is only a logical functional division, and other division methods are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
[0555] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0556] In addition, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
[0557] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0558] It should be noted that, in this application, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element.
[0559] The sequence numbers of the embodiments in this application are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.
[0560] The methods disclosed in the several method embodiments provided in this application can be arbitrarily combined without conflict to obtain new method embodiments.
[0561] The features disclosed in the several product embodiments provided in this application can be arbitrarily combined without conflict to obtain new product embodiments.
[0562] The features disclosed in the several method or device embodiments provided in this application can be arbitrarily combined without conflict to obtain new method or device embodiments.
[0563] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
A decoding method applied to a decoder, the method comprising: obtaining filter-related parameters of a current block from an APS, the filter-related parameters being filter-related parameters of a reference position-based filter; wherein the reference position comprises a position pointed by a BV of a current position, a position pointed by an MV of the current position, or a co-located position of the current position on a reference picture; determining filter coefficients of the filter of the current block according to the filter-related parameters of the current block; filtering a reconstructed value of the current position according to the reference position of the current position and the filter coefficients of the filter of the current block. The method of claim 1, wherein, The obtaining of the filter-related parameters of the current block from the APS comprises: parsing a first syntax element and a second syntax element in a bitstream; wherein the first syntax element is used to indicate an index of an APS used by the current block in a first APS candidate list; the second syntax element is used to indicate an index of filter-related parameters of the current block in the APS indicated by the first syntax element; obtaining the filter-related parameters of the current block from the first APS candidate list according to values of the first syntax element and the second syntax element. The method of claim 2, wherein, The parsing of the first syntax element and the second syntax element in the bitstream comprises: parsing a third syntax element in a bitstream; wherein the third syntax element is used to indicate a number of APSs used by a current slice or a current picture; the APSs comprise filter-related parameters of one or more filters; in a case where a value of the third syntax element is greater than 0, parsing a first syntax element and a second syntax element in a bitstream. 
The method of claim 3, wherein, The parsing of the first syntax element and the second syntax element in the case where the value of the third syntax element is greater than 0 comprises: parsing a fourth syntax element in a bitstream; in a case where a value of the fourth syntax element indicates that the current block uses a reference position-based filtering technology, and the value of the third syntax element is greater than 0, parsing a first syntax element and a second syntax element in a bitstream. The method of claim 4, wherein, The parsing of the fourth syntax element in the bitstream comprises: parsing a fifth syntax element in a bitstream; in a case where a value of the fifth syntax element indicates that a current slice or a current picture uses a reference position-based filtering technology, parsing a fourth syntax element in a bitstream. The method of claim 3, wherein, The method further comprises: parsing a seventh syntax element in a bitstream according to the third syntax element; wherein the seventh syntax element is used to indicate an index of an APS used by a current slice or a current picture; obtaining respective APSs according to values of at least one parsed seventh syntax element; constructing the first APS candidate list according to the values of at least one parsed seventh syntax element and the respective APSs. 
The method of claim 6, wherein, The parsing of the third syntax element in the bitstream and the parsing of the seventh syntax element in the bitstream according to the third syntax element comprises: parsing an eighth syntax element in a bitstream; wherein the eighth syntax element is used to indicate a position where a control identifier of a reference position-based filtering technology exists; In a case where the value of the eighth syntax element indicates that the control flag exists in the picture header, a third syntax element carried in the picture header in the bitstream is parsed, and a seventh syntax element carried in the picture header in the bitstream is parsed according to the third syntax element; Or, In a case where the value of the eighth syntax element indicates that the control flag exists in the slice header, the third syntax element carried in the slice header in the bitstream is parsed, and the seventh syntax element carried in the slice header in the bitstream is parsed according to the third syntax element; Or, In a case where the eighth syntax element is not parsed from the bitstream, the third syntax element carried in the slice header in the bitstream is parsed, and the seventh syntax element carried in the slice header in the bitstream is parsed according to the third syntax element. 
The method of claim 6, wherein, The method further comprises: parsing a ninth syntax element and a tenth syntax element in the bitstream; the ninth syntax element is used to indicate an index of an APS currently carried in the bitstream; the tenth syntax element is used to indicate a type of the APS currently carried in the bitstream; in a case where the value of the tenth syntax element indicates that the type of the APS is a reference position-based filter type, filter-related parameters of a reference position-based filter currently carried in the bitstream are parsed; according to the value of the ninth syntax element, the parsed filter-related parameters of the reference position-based filter are saved into the APS corresponding to the index. The method of claim 8, wherein, The obtaining of the corresponding APS according to the parsed value of the seventh syntax element comprises: according to the parsed value of the seventh syntax element, the corresponding APS is obtained from the APS stored in advance. The method of claim 9, wherein, The APSs stored in advance correspond to filters of the same shape, and the filter-related parameters of the reference position-based filter in each APS are used for unidirectional filtering and bidirectional filtering. 
The method of claim 8, wherein, The parsing of the filter-related parameters of the reference position-based filter currently carried in the bitstream comprises: one or more of the following syntax elements currently carried in the bitstream are parsed to determine the filter-related parameters of the reference position-based filter currently carried: an eleventh syntax element, which is used to indicate a number of filters corresponding to the filter-related parameters in the current APS; a twelfth syntax element, which is used to indicate absolute values of filter coefficients of the filter in the current APS; a thirteenth syntax element, which is used to indicate positive and negative properties of the filter coefficients of the filter in the current APS; a fourteenth syntax element, which is used to indicate precisions of the filter coefficients of the filter in the current APS; a fifteenth syntax element, which is used to indicate encoding modes of the filter coefficients of the filter in the current APS; a sixteenth syntax element, which is used to indicate whether the filter in the current APS uses nonlinear clipping; a seventeenth syntax element, which is used to indicate an index of the nonlinear clipping used by the filter in the current APS. The method of any one of claims 1-11, wherein The filtering of the reconstructed value of the current position according to the reference position of the current position and the filter coefficients of the current block comprises: determining a filtering mode of the current position; Filter the reconstructed value of the current position according to the reference position of the current position, the filter mode of the current position, and the filter coefficients of the filter of the current block. 
The method of claim 12, wherein, The filter mode of the current position is a first mode, and the first mode is a bi-directional filter mode based on inter-frame prediction; the reference position of the current position includes a first reference position and a second reference position; wherein the first reference position is a position determined on a forward inter-frame reference image based on a first MV of the current position, and the second reference position is a position determined on a backward inter-frame reference image based on a second MV of the current position. The method of claim 13, wherein, The filtering of the reconstructed value of the current position according to the reference position of the current position, the filter mode of the current position, and the filter coefficients of the filter of the current block includes: In a case where the filter mode of the current position is the first mode, first input information of the filter of the current block is obtained according to the first reference position, the forward inter-frame reference image, and the shape of the filter of the current block; Second input information of the filter of the current block is obtained according to the second reference position, the backward inter-frame reference image, and the shape of the filter of the current block; wherein the filters corresponding to the first input information and the second input information are the same filter; Third input information is determined according to the first input information and the second input information; The reconstructed value of the current position is filtered according to the third input information and the filter coefficients of the filter of the current block. 
The method of claim 12, wherein, The filter mode of the current position is a second mode, and the second mode is a bi-directional filter mode based on a co-located position; the reference position of the current position includes a third reference position and a fourth reference position; wherein the third reference position is a co-located position of the current position on a forward nearest reference image of a current image, and the fourth reference position is a co-located position of the current position on a backward nearest reference image of the current image. The method of claim 15, wherein, The filtering of the reconstructed value of the current position according to the reference position of the current position, the filter mode of the current position, and the filter coefficients of the filter of the current block includes: In a case where the filter mode of the current position is the second mode, fourth input information of the filter of the current block is obtained according to the third reference position, the forward nearest reference image, and the shape of the filter of the current block; Fifth input information of the filter of the current block is obtained according to the fourth reference position, the backward nearest reference image, and the shape of the filter of the current block; wherein the filters corresponding to the fourth input information and the fifth input information are the same filter; Sixth input information is determined according to the fourth input information and the fifth input information; The reconstructed value of the current position is filtered according to the sixth input information and the filter coefficients of the filter of the current block. 
The method of claim 12, wherein, The filtering of the reconstructed value of the current position according to the reference position of the current position, the filtering mode of the current position and the filter coefficient of the filter of the current block comprises: In a case where the filtering mode of the current position is any one of a third mode to a sixth mode, seventh input information of the filter of the current block is obtained according to the reference position of the current position, the image in which the reference position is located and the shape of the filter of the current block, wherein the third mode to the sixth mode are all unidirectional filtering modes; The reconstructed value of the current position is filtered according to the seventh input information and the filter coefficient of the filter of the current block. The method of any one of claims 12-17, wherein, The determination of the filtering mode of the current position comprises: In a case where it is determined that the current image or the current slice uses the reference position-based filtering technology, an eighteenth syntax element in a bitstream is parsed to determine the filtering mode of the current position. 
The method according to any one of claims 1-18, wherein, In a case where the current block satisfies a first condition, the filtering of the reconstructed value of the current position is skipped; The first condition comprises one or more of the following conditions: The prediction mode of the current block does not select an inter prediction mode; The prediction mode of the current block does not select an IBC prediction mode; The value of the residual sample of the current block is all 0; The prediction mode of the current block is an inter prediction mode, and a horizontal displacement component in a used motion vector is greater than or equal to a first threshold value and / or a vertical displacement component in the used motion vector is greater than or equal to a second threshold value; The prediction mode of the current block is an IBC prediction mode, and a horizontal displacement component in a used block vector is greater than or equal to a third threshold value and / or a vertical displacement component in the used motion vector is greater than or equal to a fourth threshold value. A method of encoding, applied to an encoder, the method comprising: determining a filter-related parameter of a current block, the filter-related parameter being a filter-related parameter of a reference position-based filter; wherein the reference position comprises a position pointed to by a BV of a current position, a position pointed to by an MV of the current position or a co-located position of the current position on a reference image; determining a filter coefficient of a filter of the current block according to the filter-related parameter of the current block; filtering a reconstructed value of the current position according to the reference position of the current position and the filter coefficient of the filter of the current block. 
The method of claim 20, wherein, The method further comprises: determining values of a first syntax element and a second syntax element; wherein the first syntax element is used to indicate an index of an APS used by the current block in a first APS candidate list; and the second syntax element is used to indicate an index of the filter-related parameter of the current block in the APS indicated by the first syntax element; encoding the values of the first syntax element and the second syntax element, and writing obtained encoded bits into a code stream. The method of claim 21, wherein, The determination of the values of the first syntax element and the second syntax element comprises: determining a number of APSs used by a current slice or a current image; In a case where a number of APSs used by the current slice or the current picture is greater than 1, values of the first syntax element and the second syntax element are determined; the APSs include filter-related parameters of one or more filters. The method of claim 22, wherein, The method further includes: According to the number of APSs used by the current slice or the current picture, a value of a third syntax element is determined; the third syntax element is used to indicate the number of APSs used by the current slice or the current picture; The value of the third syntax element is encoded, and the obtained encoded bits are written into the bitstream. The method of claim 22, wherein, The method further includes: In a case where the current block uses the reference location-based filtering technology and the number of APSs used by the current slice or the current picture is greater than 1, values of the first syntax element and the second syntax element are determined. 
The method of claim 23, wherein, The method further includes: According to the current block using the reference location-based filtering technology, a value of a fourth syntax element is determined; The value of the fourth syntax element is encoded, and the obtained encoded bits are written into the bitstream. The method of claim 24, wherein, The method further includes: In a case where the current slice or the current picture uses the reference location-based filter for filtering, it is determined whether the current block uses the reference location-based filtering technology. The method of claim 26, wherein, The method further includes: According to the current slice or the current picture using the reference location-based filter for filtering, a value of a fifth syntax element is determined; The value of the fifth syntax element is encoded, and the obtained encoded bits are written into the bitstream. The method of claim 21, wherein, The method further includes: An index of at least one APS used by the current slice or the current picture is determined; According to the index of the at least one APS used by the current slice or the current picture and the respective APS, the first APS candidate list is constructed. The method of claim 28, wherein, The method further includes: According to the index of the at least one APS used by the current slice or the current picture, a value of a seventh syntax element is determined; The value of the seventh syntax element is encoded, and the obtained encoded bits are written into the bitstream. 
The method of claim 29, wherein, The control identifier of the reference location-based filtering technology includes the third syntax element and the seventh syntax element; The method further includes: In a case where the control identifier exists in a picture header of the bitstream, a value of an eighth syntax element is a first value, and the value of the eighth syntax element is encoded, and the obtained encoded bits are written into the bitstream; or, In a case where the control identifier exists in a slice header of the bitstream, a value of an eighth syntax element is a second value, and the value of the eighth syntax element is encoded, and the obtained encoded bits are written into the bitstream; or, In a case where the control identifier exists in a slice header of the bitstream, a position of the control identifier in the bitstream is not indicated. The method of claim 28, wherein, The method further includes: Values of a ninth syntax element and a tenth syntax element are determined; the ninth syntax element is used to indicate an index of an APS currently carried in the bitstream; the tenth syntax element is used to indicate a type of the APS currently carried in the bitstream; The values of the ninth syntax element and the tenth syntax element are encoded, and the obtained encoded bits are written into the bitstream. 
The method of claim 31, wherein, The method further includes: According to the filter-related parameters of the reference position-based filter in the APS corresponding to the value of the ninth syntax element, one or more of the following syntax elements are determined: An eleventh syntax element, which is used to indicate the number of filters corresponding to the filter-related parameters in the current APS; A twelfth syntax element, which is used to indicate the absolute values of the filter coefficients of the filter in the current APS; A thirteenth syntax element, which is used to indicate the positive or negative of the filter coefficients of the filter in the current APS; A fourteenth syntax element, which is used to indicate the precision of the filter coefficients of the filter in the current APS; A fifteenth syntax element, which is used to indicate the encoding mode of the filter coefficients of the filter in the current APS; A sixteenth syntax element, which is used to indicate whether the filter in the current APS uses nonlinear clipping; A seventeenth syntax element, which is used to indicate the index of the nonlinear clipping used by the filter in the current APS; The values of one or more of the eleventh syntax element to the seventeenth syntax element are encoded, and the obtained encoding bits are written into the code stream. The method of claim 32, wherein, The shapes of the reference position-based filters in the APSs indicated in the code stream are the same, and the filter-related parameters of the reference position-based filters in each APS are used for unidirectional filtering and bidirectional filtering. 
The method of any one of claims 20-33, wherein, The filtering of the reconstructed value of the current position according to the reference position of the current position and the filter coefficients of the filter of the current block includes: Determining the filtering mode of the current position; Filtering the reconstructed value of the current position according to the reference position of the current position, the filtering mode of the current position, and the filter coefficients of the filter of the current block. The method of claim 34, wherein, The filtering mode of the current position is a first mode, the first mode is a bidirectional filtering mode based on inter prediction, the reference position of the current position includes a first reference position and a second reference position, wherein the first reference position is a position determined on a forward inter prediction reference image based on a first MV of the current position, and the second reference position is a position determined on a backward inter prediction reference image based on a second MV of the current position. 
The method of claim 35, wherein, The filtering of the reconstructed value of the current position according to the reference position of the current position, the filtering mode of the current position, and the filter coefficients of the filter of the current block includes: In a case where the filtering mode of the current position is the first mode, first input information of the filter of the current block is obtained according to the first reference position, the forward inter prediction reference image, and the shape of the filter of the current block; According to the second reference position, the backward inter prediction reference image and the shape of the filter of the current block, second input information of the filter of the current block is obtained; wherein the filter corresponding to the first input information and the second input information is the same filter; According to the first input information and the second input information, third input information is determined; According to the third input information and the filter coefficient of the filter of the current block, the reconstructed value of the current position is filtered. The method of claim 34, wherein, The filter mode of the current position is a second mode, and the second mode is a bi-directional filter mode based on a co-located position; the reference position of the current position includes a third reference position and a fourth reference position; wherein the third reference position is a co-located position of the current position on a forward nearest reference image of a current image, and the fourth reference position is a co-located position of the current position on a backward nearest reference image of the current image. 
The method of claim 37, wherein, According to the reference position of the current position, the filter mode of the current position and the filter coefficient of the filter of the current block, the reconstructed value of the current position is filtered, including: In the case that the filter mode of the current position is the second mode, fourth input information of the filter of the current block is obtained according to the third reference position, the forward nearest reference image and the shape of the filter of the current block; According to the fourth reference position, the backward nearest reference image and the shape of the filter of the current block, fifth input information of the filter of the current block is obtained; wherein the filter corresponding to the fourth input information and the fifth input information is the same filter; According to the fourth input information and the fifth input information, sixth input information is determined; According to the sixth input information and the filter coefficient of the filter of the current block, the reconstructed value of the current position is filtered. The method of claim 34, wherein, According to the reference position of the current position, the filter mode of the current position and the filter coefficient of the filter of the current block, the reconstructed value of the current position is filtered, including: In the case that the filter mode of the current position is any one of a third mode to a sixth mode, seventh input information of the filter of the current block is obtained according to the reference position of the current position, the image where the reference position is located and the shape of the filter of the current block; wherein the third mode to the sixth mode are all unidirectional filter modes; According to the seventh input information and the filter coefficient of the filter of the current block, the reconstructed value of the current position is filtered. 
The method of any one of claims 34-39, wherein, The method further includes: In the case that it is determined that the current image or the current slice uses the reference position-based filter technology, the filter mode of the current image or the current slice is determined; According to the filter mode of the current image or the current slice, the value of the eighteenth syntax element is determined; The value of the eighteenth syntax element is encoded, and the obtained encoded bits are written into a bitstream. The method of any one of claims 20-40, wherein, In the case that the current block satisfies a first condition, the filtering of the reconstructed value of the current position is skipped; The first condition comprises one or more of the following conditions: The prediction mode of the current block does not select an inter prediction mode; The prediction mode of the current block does not select an IBC prediction mode; The values of the residual samples of the current block are all 0; The prediction mode of the current block is an inter prediction mode, and a horizontal displacement component in a used motion vector is greater than or equal to a first threshold and / or a vertical displacement component in the used motion vector is greater than or equal to a second threshold; The prediction mode of the current block is an IBC prediction mode, and a horizontal displacement component in a used block vector is greater than or equal to a third threshold and / or a vertical displacement component in the used motion vector is greater than or equal to a fourth threshold. 
An encoder, comprising a first determining unit and a first filtering unit, wherein: The first determining unit is configured to determine a filter-related parameter of a current block, the filter-related parameter being a filter-related parameter of a filter based on a reference position; wherein the reference position comprises a position pointed by a BV of the current position, a position pointed by an MV of the current position, or a collocated position of the current position on a reference picture; The first determining unit is configured to determine a filter coefficient of the filter of the current block according to the filter-related parameter of the current block; The first filtering unit is configured to filter a reconstructed value of the current position according to the reference position of the current position and the filter coefficient of the filter of the current block. An encoder, comprising a first memory and a first processor, wherein: The first memory is configured to store a computer program capable of running on the first processor; The first processor is configured to execute the method according to any one of claims 20 to 41 when running the computer program. 
A decoder, comprising a first obtaining unit, a second determining unit and a second filtering unit, wherein: The first obtaining unit is configured to obtain a filter-related parameter of a current block from an APS, the filter-related parameter being a filter-related parameter of a filter based on a reference position; wherein the reference position comprises a position pointed by a BV of the current position, a position pointed by an MV of the current position, or a collocated position of the current position on a reference picture; The second determining unit is configured to determine a filter coefficient of the filter of the current block according to the filter-related parameter of the current block; The second filtering unit is configured to filter a reconstructed value of the current position according to the reference position of the current position and the filter coefficient of the filter of the current block. A decoder, comprising a second memory and a second processor, wherein: The second memory is configured to store a computer program capable of running on the second processor; The second processor is configured to execute the method according to any one of claims 1 to 19 when running the computer program. A computer-readable storage medium having stored thereon a computer program, wherein, The computer program is executed by a processor to implement the method according to any one of claims 1 to 19, or to implement the method according to any one of claims 20 to 41. A computer-readable storage medium having a code stream stored thereon, wherein, The code stream is generated by performing the steps of the encoding method according to any one of claims 20 to 41.
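For illustration only, and without limiting the claims, the bi-directional filtering based on inter prediction recited above can be sketched in Python as follows. The claims do not state how the third input information is derived from the first and second input information, nor the arithmetic of the filter itself; the rounded average, the correction-style weighted sum with a fixed-point shift of 7, and the clamping at picture borders are all assumptions of this sketch, as are the names `gather_taps` and `filter_bidirectional`.

```python
from typing import List, Sequence, Tuple

Offset = Tuple[int, int]  # (dy, dx) offset of one filter tap from the reference position


def gather_taps(ref: Sequence[Sequence[int]], pos: Tuple[int, int],
                shape: Sequence[Offset]) -> List[int]:
    """Collect one sample per tap of the filter shape around pos.
    Assumption: coordinates are clamped to the picture borders."""
    height, width = len(ref), len(ref[0])
    y0, x0 = pos
    return [ref[min(max(y0 + dy, 0), height - 1)][min(max(x0 + dx, 0), width - 1)]
            for dy, dx in shape]


def filter_bidirectional(rec: int,
                         fwd_ref: Sequence[Sequence[int]], fwd_pos: Tuple[int, int],
                         bwd_ref: Sequence[Sequence[int]], bwd_pos: Tuple[int, int],
                         shape: Sequence[Offset], coeffs: Sequence[int],
                         shift: int = 7) -> int:
    """Filter one reconstructed sample using forward and backward references."""
    # First and second input information use the same filter shape.
    first = gather_taps(fwd_ref, fwd_pos, shape)
    second = gather_taps(bwd_ref, bwd_pos, shape)
    # Assumption: third input information is the rounded average of the two.
    third = [(a + b + 1) >> 1 for a, b in zip(first, second)]
    # Assumption: the filter adds a weighted sum of differences to rec,
    # with coefficients in fixed point (1 << shift represents 1.0).
    acc = sum(c * (s - rec) for c, s in zip(coeffs, third))
    return rec + ((acc + (1 << (shift - 1))) >> shift)


# Usage with flat 3x3 reference pictures and a three-tap shape.
forward = [[100] * 3 for _ in range(3)]
backward = [[110] * 3 for _ in range(3)]
taps = [(0, 0), (0, 1), (1, 0)]
filtered = filter_bidirectional(90, forward, (1, 1), backward, (1, 1),
                                taps, [64, 32, 32])
```

A single coefficient set serves both reference directions here, consistent with the recitation that the first and second input information correspond to the same filter.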