Presence and relative decoding order of neural network post-processing filter SEI messages

By specifying the existence conditions and order rules of neural network post-processing filter activation messages in the video bitstream, the problems of information redundancy and decoding inconsistency are solved, thereby improving the efficiency and consistency of video encoding and decoding.

CN119732046B · Active · Publication Date: 2026-06-23 · DOUYIN CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
DOUYIN CO LTD
Filing Date
2023-08-16
Publication Date
2026-06-23

Abstract

A mechanism of processing video data is disclosed. The mechanism includes performing a conversion between visual media data and a bitstream based on a rule. The rule specifies that a neural network post-filter activation (NNPFA) supplemental enhancement information (SEI) message having a first particular value of a NNPFA identifier is only present in a current picture unit (PU) when one or both of the following conditions are met. First, a currently coded layer video sequence (CLVS) includes a neural network post-filter characteristics (NNPFC) SEI message in a previous PU that precedes the current PU in decoding order having a NNPFC identifier (nnpfc_id) equal to the first particular value of the NNPFA identifier. Second, the current PU includes a NNPFC SEI message having the nnpfc_id equal to the first particular value of the NNPFA identifier.

Description

[0001] Cross-references to related applications

[0002] This application claims the benefit of U.S. Provisional Application No. 63/398,694, filed August 17, 2022, the teachings and disclosures of which are hereby incorporated herein by reference in their entirety.

Technical Field

[0003] This patent document relates to the generation, storage, and use of digital audio and video media information in file formats.

Background

[0004] Digital video accounts for the largest share of bandwidth used on the Internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, the bandwidth demand for digital video is likely to continue to grow.

Summary of the Invention

[0005] The first aspect relates to a method for processing media data, comprising: performing a conversion between visual media data and a bitstream based on a rule; wherein the rule specifies that a neural network post-processing filter activation (NNPFA) supplemental enhancement information (SEI) message having a first specific value of an NNPFA identifier is present in a current picture unit (PU) only when one or both of the following conditions are met: a) the current coded layer video sequence (CLVS) includes, in a previous PU preceding the current PU in decoding order, a neural network post-processing filter characteristics (NNPFC) SEI message whose NNPFC identifier (nnpfc_id) is equal to the first specific value of the NNPFA identifier; or b) the current PU includes an NNPFC SEI message whose nnpfc_id is equal to the first specific value of the NNPFA identifier.
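As a rough illustration of the presence rule in the first aspect, the sketch below walks the PUs of one CLVS in decoding order and checks that every NNPFA SEI message has a matching NNPFC SEI message either in an earlier PU or in the same PU. The `SeiMessage` class, the string labels "NNPFC" and "NNPFA", and the function name are hypothetical and not part of the disclosure. Bitstream parsing is omitted, and the ordering of messages inside a single PU is not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SeiMessage:
    kind: str   # hypothetical label: "NNPFC" or "NNPFA"
    ident: int  # nnpfc_id for NNPFC, the NNPFA identifier for NNPFA


def nnpfa_presence_ok(clvs: List[List[SeiMessage]]) -> bool:
    """Return True if every NNPFA in the CLVS satisfies condition a) or b).

    `clvs` is a list of PUs in decoding order; each PU is a list of SEI messages.
    """
    ids_in_previous_pus = set()
    for pu in clvs:
        ids_in_current_pu = {m.ident for m in pu if m.kind == "NNPFC"}
        for m in pu:
            if m.kind != "NNPFA":
                continue
            in_previous = m.ident in ids_in_previous_pus  # condition a)
            in_current = m.ident in ids_in_current_pu     # condition b)
            if not (in_previous or in_current):
                return False
        ids_in_previous_pus |= ids_in_current_pu
    return True
```

Under these assumptions, a CLVS whose first PU carries only an NNPFA SEI message with identifier 5, followed by a later PU carrying the NNPFC SEI message with nnpfc_id 5, would be rejected.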

[0006] The second aspect relates to a non-transitory computer-readable recording medium storing a bitstream of video generated by a method performed by a video processing apparatus, wherein the method includes: performing a conversion between visual media data and the bitstream based on a rule; wherein the rule specifies that a neural network post-processing filter activation (NNPFA) supplemental enhancement information (SEI) message having a first specific value of an NNPFA identifier is present in a current picture unit (PU) only when one or both of the following conditions are met: a) the current coded layer video sequence (CLVS) includes, in a previous PU preceding the current PU in decoding order, a neural network post-processing filter characteristics (NNPFC) SEI message whose NNPFC identifier (nnpfc_id) is equal to the first specific value of the NNPFA identifier; or b) the current PU includes an NNPFC SEI message whose nnpfc_id is equal to the first specific value of the NNPFA identifier.

[0007] The third aspect relates to a method for storing a bitstream of video, comprising: determining the current value of an NNPFA identifier in a neural network post-processing filter activation (NNPFA) supplemental enhancement information (SEI) message in a current picture unit (PU), wherein the NNPFA SEI message is present in the current PU only if: the current coded layer video sequence (CLVS) includes, in a previous PU preceding the current PU in decoding order, a neural network post-processing filter characteristics (NNPFC) SEI message whose NNPFC identifier (nnpfc_id) is equal to the previous NNPFA identifier; or the current PU includes an NNPFC SEI message whose nnpfc_id is equal to the current value of the NNPFA identifier; generating a bitstream based on the determination; and storing the bitstream in a non-transitory computer-readable recording medium.

[0008] The fourth aspect relates to an apparatus for processing video data, comprising: a processor; and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to perform any of the foregoing aspects.

[0009] The fifth aspect relates to a non-transitory computer-readable medium comprising a computer program product for use by a video codec apparatus, the computer program product comprising computer-executable instructions stored on the non-transitory computer-readable medium, causing the video codec apparatus to perform the methods of any of the preceding aspects when executed by a processor.

[0010] For clarity, any of the foregoing embodiments may be combined with one or more of the other foregoing embodiments to create new embodiments within the scope of this disclosure.

[0011] These and other features will become clearer from the following detailed description taken in conjunction with the accompanying drawings and claims.

Brief Description of the Drawings

[0012] To gain a more complete understanding of this disclosure, reference is now made to the following brief description taken in conjunction with the accompanying drawings and specific embodiments, wherein like reference numerals denote like parts.

[0013] Figure 1 is an example illustration of a luminance data channel where nnpfc_inp_order_idc equals 3.

[0014] Figure 2 is a block diagram illustrating an exemplary video processing system.

[0015] Figure 3 is a block diagram of an example video processing apparatus.

[0016] Figure 4 is a flowchart of an example method for video processing.

[0017] Figure 5 is a block diagram illustrating an example video codec system.

[0018] Figure 6 is a block diagram illustrating an example encoder.

[0019] Figure 7 is a block diagram illustrating an example decoder.

[0020] Figure 8 is a schematic diagram of an example encoder.

[0021] Figure 9 is a flowchart of an example method for video processing.

Detailed Description

[0022] First, it should be understood that although illustrative implementations of one or more embodiments are provided below, any number of the systems and / or methods disclosed herein, whether currently known or yet to be developed, may be used. This disclosure should not be limited in any way to the exemplary embodiments, drawings, and techniques shown below, including the exemplary designs and implementations shown and described herein, but modifications may be made within the full scope of the appended claims and their equivalents.

[0023] The section headings used in this document are for ease of understanding and do not limit the applicability of the techniques and embodiments disclosed in each section to that section only. Furthermore, H.266 terminology is used in some descriptions only for ease of understanding and not to limit the scope of the disclosed techniques. The techniques described herein are therefore also applicable to other video codec protocols and designs. In this document, editing changes to the text of the Versatile Video Coding (VVC) specification and / or the Versatile Supplemental Enhancement Information (VSEI) standard are shown with bold italics indicating deleted text and bold indicating added text.

[0024] 1. Preliminary Discussion

[0025] This document relates to image and / or video coding techniques. Specifically, this disclosure relates to the presence and order of the neural network post-processing filter supplemental enhancement information (SEI) messages used to signal neural network post-processing filters in a video bitstream. The examples can be applied individually or in various combinations to video bitstreams coded by any codec, such as the VVC standard and / or the Versatile SEI messages for coded video bitstreams (VSEI) standard.

[0026] 2. Abbreviations

[0027] Adaptation Parameter Set (APS), Access Unit (AU), Coded Layer Video Sequence (CLVS), Coded Layer Video Sequence Start (CLVSS), Cyclic Redundancy Check (CRC), Coded Video Sequence (CVS), Finite Impulse Response (FIR), Intra Random Access Point (IRAP), Network Abstraction Layer (NAL), Picture Parameter Set (PPS), Picture Unit (PU), Random Access Skipped Leading (RASL), Supplemental Enhancement Information (SEI), Step-wise Temporal Sublayer Access (STSA), Video Coding Layer (VCL), Versatile Supplemental Enhancement Information (VSEI) as described in Rec. ITU-T H.274 | ISO / IEC 23002-7, Video Usability Information (VUI), Versatile Video Coding (VVC) as described in Rec. ITU-T H.266 | ISO / IEC 23090-3.

[0028] 3. Further discussion

[0029] 3.1 Video codec standards

[0030] Video coding standards have evolved primarily through the development of standards by the International Telecommunication Union (ITU) Telecommunication Standardization Sector (ITU-T) and the International Organization for Standardization (ISO) / International Electrotechnical Commission (IEC). ITU-T produced H.261 and H.263, ISO / IEC produced Moving Picture Experts Group (MPEG)-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262 / MPEG-2 Video, H.264 / MPEG-4 Advanced Video Coding (AVC), and H.265 / High Efficiency Video Coding (HEVC) standards[1]. Since H.262, video coding standards have been based on a hybrid video coding structure that uses temporal prediction plus transform coding. To explore future video coding technologies beyond HEVC, the Video Coding Experts Group (VCEG) and MPEG jointly founded the Joint Video Exploration Team (JVET). JVET adopted many methods and put them into reference software named the Joint Exploration Model (JEM)[2]. When the Versatile Video Coding (VVC) project officially started, JVET was renamed the Joint Video Experts Team (JVET). VVC[3] is a coding standard that targets a 50% bit rate reduction compared with HEVC.

[0031] The Versatile Video Coding (VVC) standard (ITU-T H.266 | ISO / IEC 23090-3)[3][4] and the associated Versatile Supplemental Enhancement Information (VSEI) standard (ITU-T H.274 | ISO / IEC 23002-7)[4] are designed for the widest range of applications, including traditional uses such as television broadcasting, video conferencing, or playback from storage media, as well as newer and more advanced use cases such as adaptive bitrate streaming, video region extraction, composition and merging of content from multiple coded video bitstreams, multi-view video, scalable layered coding, and viewport-adaptive 360° immersive media.

[0032] The Essential Video Coding (EVC) standard (ISO / IEC 23094-1) is another video coding standard developed by MPEG.

[0033] 3.2 General SEI Messages and SEI Messages in VVC and VSEI

[0034] SEI messages assist in processes related to decoding, display, or other purposes. However, the decoding process does not require SEI messages to construct the luma or chroma samples. Conforming decoders are not required to process this information for output order conformance. Some SEI messages are required for checking bitstream conformance and for output timing decoder conformance; other SEI messages are not required for checking bitstream conformance.

[0035] Annex D of VVC specifies the syntax and semantics of the SEI message payloads for some SEI messages, and specifies the use of the SEI messages and VUI parameters whose syntax and semantics are specified in ITU-T H.274 | ISO / IEC 23002-7.

[0036] 3.3 Signalling of Neural Network Post-Processing Filters

[0037] JVET-AA2006[5] includes the specification of two SEI messages for signalling neural network post-processing filters, as shown below.

[0038] 8.28 Neural Network Post-Processing Filter Characteristics SEI Message

[0039] 8.28.1 Neural Network Post-Processing Filter Characteristics SEI Message Syntax

[0040]

[0041]

[0042]

[0043] 8.28.2 Neural Network Post-Processing Filter Characteristics SEI Message Semantics

[0044] This SEI message specifies a neural network that can be used as a post-processing filter. Neural network post-processing filter activation SEI messages indicate that a specified post-processing filter is to be used for particular images.

[0045] Using this SEI message requires the following variables to be defined:
– The width and height of the cropped decoded output image in units of luminance samples, denoted in this document as CroppedWidth and CroppedHeight, respectively.
– The luminance sample array CroppedYPic[x][y] and the chrominance sample arrays CroppedCbPic[x][y] and CroppedCrPic[x][y] (if present) of the cropped decoded output image for vertical coordinate y and horizontal coordinate x, where the coordinates of the top-left corner of the sample array are y=0 and x=0.
– The bit depth BitDepthY of the luminance sample array of the cropped decoded output image.
– The bit depth BitDepthC of the chrominance sample arrays (if present) of the cropped decoded output image.
– The chrominance format indicator, denoted in this document as ChromaFormatIdc, as described in Section 7.3.
– The quantization strength value StrengthControlVal when nnpfc_auxiliary_inp_idc equals 1.

[0046] When the SEI message specifies a neural network that can be used as a post-processing filter, the semantic specification includes the derivation of the output luminance sample array FilteredYPic[x][y] and chrominance sample arrays FilteredCbPic[x][y] and FilteredCrPic[x][y], as indicated by the value of nnpfc_out_order_idc. The variables SubWidthC and SubHeightC are derived from ChromaFormatIdc as specified in Table 2.

[0047] nnpfc_id contains an identifier that can be used to identify the post-processing filter. The value of nnpfc_id should be in the range of 0 to 2^32 - 2 (inclusive). Values of nnpfc_id from 256 to 511 (inclusive) and from 2^31 to 2^32 - 2 (inclusive) are reserved for future use by ITU-T|ISO / IEC. Decoders that encounter a value of nnpfc_id in the range of 256 to 511 (inclusive) or in the range of 2^31 to 2^32 - 2 (inclusive) should ignore it. An nnpfc_mode_idc value of 0 indicates that the post-processing filter associated with the nnpfc_id value is determined by an external method not specified in this specification. An nnpfc_mode_idc value of 1 indicates that the post-processing filter associated with the nnpfc_id value is a neural network represented by the ISO / IEC 15938-17 bitstream included in this SEI message.

[0048] An nnpfc_mode_idc value of 2 indicates that the post-processing filter associated with the nnpfc_id value is a neural network identified by a specified tag Uniform Resource Identifier (URI) (nnpfc_uri_tag[i]) and a neural network information URI (nnpfc_uri[i]). The value of nnpfc_mode_idc should be in the range of 0 to 255 (inclusive). Values of nnpfc_mode_idc greater than 2 are reserved for future ITU-T|ISO / IEC specifications and should not be present in bitstreams conforming to this version of the specification. Decoders conforming to this version of the specification should ignore SEI messages containing reserved values of nnpfc_mode_idc. A value of 0 for nnpfc_purpose_and_formatting_flag indicates that no syntax elements related to the filter's purpose, input format, output format, and complexity are present. A value of 1 for nnpfc_purpose_and_formatting_flag indicates that syntax elements related to the filter's purpose, input format, output format, and complexity are present. When nnpfc_mode_idc equals 1 and the current CLVS does not include a previous neural network post-processing filter characteristics SEI message in decoding order whose nnpfc_id value is equal to the nnpfc_id value in this SEI message, nnpfc_purpose_and_formatting_flag should be equal to 1.
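The value ranges given above for nnpfc_id and nnpfc_mode_idc can be summarised in a short sketch. The function names and the returned labels are hypothetical and chosen only for illustration; the ranges follow the semantics quoted above (0 to 2^32 - 2 for nnpfc_id, with 256 to 511 and 2^31 to 2^32 - 2 reserved, and modes 0 to 2 defined).

```python
def classify_nnpfc_id(nnpfc_id: int) -> str:
    """Classify an nnpfc_id value as 'usable', 'reserved' or 'out of range'."""
    if not 0 <= nnpfc_id <= 2**32 - 2:
        return "out of range"
    if 256 <= nnpfc_id <= 511 or 2**31 <= nnpfc_id <= 2**32 - 2:
        return "reserved"  # a decoder would ignore such a value
    return "usable"


def classify_nnpfc_mode_idc(nnpfc_mode_idc: int) -> str:
    """Describe how the filter is conveyed for a given nnpfc_mode_idc."""
    if not 0 <= nnpfc_mode_idc <= 255:
        return "out of range"
    meanings = {
        0: "external means",
        1: "ISO/IEC 15938-17 bitstream carried in the SEI message",
        2: "identified by tag URI and neural network information URI",
    }
    # Values 3..255 are reserved; the SEI message would be ignored.
    return meanings.get(nnpfc_mode_idc, "reserved")
```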

[0049] When the current CLVS includes a previous neural network post-processing filter characteristics SEI message in decoding order whose nnpfc_id value is equal to the nnpfc_id value in this SEI message, at least one of the following conditions should apply:
– This SEI message has nnpfc_mode_idc equal to 1 and nnpfc_purpose_and_formatting_flag equal to 0, in order to provide a neural network update.
– This SEI message has the same content as the previous neural network post-processing filter characteristics SEI message.
When this SEI message is the first neural network post-processing filter characteristics SEI message in decoding order that has a particular nnpfc_id value within the current CLVS, it specifies a base post-processing filter that pertains to the current decoded image and all subsequent decoded images of the current layer, in output order, until the end of the current CLVS. When this SEI message is not the first neural network post-processing filter characteristics SEI message in decoding order that has a particular nnpfc_id value within the current CLVS, it pertains to the current decoded image and all subsequent decoded images of the current layer, in output order, until the end of the current CLVS or until the next neural network post-processing filter characteristics SEI message having that particular nnpfc_id value, in output order, within the current CLVS.
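The constraints on the first and on repeated NNPFC SEI messages with the same nnpfc_id can be sketched as a check over the messages of one CLVS in decoding order. The `Nnpfc` class and its `payload` field are hypothetical stand-ins for the full message content. Comparing a repeat against the most recent message with the same nnpfc_id is an assumption, since the text only refers to "the previous" message.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Nnpfc:
    nnpfc_id: int
    mode_idc: int
    purpose_and_formatting_flag: int
    payload: bytes = b""  # hypothetical stand-in for the remaining content


def nnpfc_sequence_ok(messages: Iterable[Nnpfc]) -> bool:
    """Check NNPFC messages of one CLVS, given in decoding order."""
    last_by_id: Dict[int, Nnpfc] = {}
    for msg in messages:
        prev = last_by_id.get(msg.nnpfc_id)
        if prev is None:
            # First message with this id: mode 1 requires the flag to be 1.
            if msg.mode_idc == 1 and msg.purpose_and_formatting_flag != 1:
                return False
        else:
            is_update = msg.mode_idc == 1 and msg.purpose_and_formatting_flag == 0
            is_repeat = msg == prev  # same content as the previous message
            if not (is_update or is_repeat):
                return False
        last_by_id[msg.nnpfc_id] = msg
    return True
```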

[0050] `nnpfc_purpose` indicates the purpose of the post-processing filter, as specified in Table 20. The value of `nnpfc_purpose` should be in the range of 0 to 2^32 - 2 (inclusive). Values of nnpfc_purpose not appearing in Table 20 are reserved for future ITU-T|ISO / IEC specifications and should not be present in bitstreams conforming to this version of the specification. Decoders conforming to this version of the specification should ignore SEI messages that include reserved values of nnpfc_purpose.

[0051] Table 20 – Definition of nnpfc_purpose

[0052]

[0053] Note 1 – When ITU-T|ISO / IEC uses the reserved value of nnpfc_purpose in the future, the syntax of this SEI message can be extended by syntax elements, the existence of which is conditional on nnpfc_purpose being equal to that value.

[0054] When SubWidthC is equal to 1 and SubHeightC is equal to 1, nnpfc_purpose should not be equal to 2 or 4. nnpfc_out_sub_c_flag equal to 1 specifies that outSubWidthC is equal to 1 and outSubHeightC is equal to 1. nnpfc_out_sub_c_flag equal to 0 specifies that outSubWidthC is equal to 2 and outSubHeightC is equal to 1. When nnpfc_out_sub_c_flag is not present, outSubWidthC is inferred to be equal to SubWidthC, and outSubHeightC is inferred to be equal to SubHeightC. If SubWidthC is equal to 2 and SubHeightC is equal to 1, nnpfc_out_sub_c_flag should not be equal to 0.

[0055] `nnpfc_pic_width_in_luma_samples` and `nnpfc_pic_height_in_luma_samples` specify the width and height, respectively, of the luminance sample array of the image generated by applying the post-processing filter identified by `nnpfc_id` to the cropped decoded output image. When `nnpfc_pic_width_in_luma_samples` and `nnpfc_pic_height_in_luma_samples` are not present, they are inferred to be equal to `CroppedWidth` and `CroppedHeight`, respectively. `nnpfc_component_last_flag` equal to 0 specifies that the second dimension of the input tensor `inputTensor` to the post-processing filter and of the output tensor `outputTensor` produced by the post-processing filter is used for the channel. `nnpfc_component_last_flag` equal to 1 specifies that the last dimension of the input tensor `inputTensor` to the post-processing filter and of the output tensor `outputTensor` produced by the post-processing filter is used for the channel.
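The derivation of outSubWidthC and outSubHeightC from nnpfc_out_sub_c_flag can be written out as a small helper. This is only a sketch: the function name is hypothetical, `None` is used to represent a flag that is not present, and raising an exception for the disallowed combination is an illustrative choice.

```python
from typing import Optional, Tuple


def derive_out_chroma_subsampling(
    sub_width_c: int,
    sub_height_c: int,
    nnpfc_out_sub_c_flag: Optional[int],
) -> Tuple[int, int]:
    """Return (outSubWidthC, outSubHeightC) as described in the text."""
    if nnpfc_out_sub_c_flag is None:
        # Flag not present: inferred from the decoded picture's format.
        return sub_width_c, sub_height_c
    if nnpfc_out_sub_c_flag == 1:
        return 1, 1
    # Flag equal to 0: not allowed when SubWidthC == 2 and SubHeightC == 1.
    if sub_width_c == 2 and sub_height_c == 1:
        raise ValueError("nnpfc_out_sub_c_flag should not be 0 for this format")
    return 2, 1
```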

[0056] Note 2 – The first dimension in both the input and output tensors is used for batch indexing, a practice common in some neural network frameworks. While the semantics of this SEI message use a batch size equal to 1, the batch size used as input for neural network inference depends on the post-processing implementation. Note 3 – The color component is an example of a channel.

[0057] `nnpfc_inp_format_flag` indicates the method for converting the sample values of the cropped decoded output image into the input values of the post-processing filter. When `nnpfc_inp_format_flag` equals 0, the input values of the post-processing filter are real numbers, and the functions `InpY` and `InpC` are specified as follows:

[0058] InpY( x ) = x ÷ ( ( 1 << BitDepthY ) - 1 ) (75)

[0059] InpC( x ) = x ÷ ( ( 1 << BitDepthC ) - 1 ) (76)
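Equations (75) and (76) normalise an integer sample to a real value in the range 0 to 1 by dividing by the largest value representable at the given bit depth. A minimal sketch, with hypothetical function and parameter names:

```python
def inp_y(x: int, bit_depth_y: int) -> float:
    """Equation (75): luminance sample to real-valued filter input."""
    return x / ((1 << bit_depth_y) - 1)


def inp_c(x: int, bit_depth_c: int) -> float:
    """Equation (76): chrominance sample to real-valued filter input."""
    return x / ((1 << bit_depth_c) - 1)
```

For example, an 8-bit luminance sample of 255 maps to 1.0, and a 10-bit chrominance sample of 0 maps to 0.0.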

[0060] When nnpfc_inp_format_flag equals 1, the input value of the post-processing filter is an unsigned integer, and the functions InpY and InpC are specified as follows:

[0061]

[0062] The variable `inpTensorBitDepth` is derived from the syntax element `nnpfc_inp_tensor_bitdepth_minus8`, as specified below. `nnpfc_inp_tensor_bitdepth_minus8` plus 8 specifies the bit depth of the luminance sample values in the input integer tensor. The value of `inpTensorBitDepth` is derived as follows:

[0063] inpTensorBitDepth = nnpfc_inp_tensor_bitdepth_minus8 + 8 (78)

[0064] It is a requirement of bitstream conformance that the value of nnpfc_inp_tensor_bitdepth_minus8 be in the range of 0 to 24 (inclusive).
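Equation (78) and the accompanying range constraint amount to the following arithmetic. The function name and the choice to raise an exception for out-of-range values are illustrative assumptions.

```python
def inp_tensor_bit_depth(nnpfc_inp_tensor_bitdepth_minus8: int) -> int:
    """Equation (78), with the 0..24 range constraint checked first."""
    if not 0 <= nnpfc_inp_tensor_bitdepth_minus8 <= 24:
        raise ValueError("nnpfc_inp_tensor_bitdepth_minus8 must be in 0..24")
    return nnpfc_inp_tensor_bitdepth_minus8 + 8
```

The resulting bit depth therefore ranges from 8 to 32.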

[0065] A non-zero value for `nnpfc_auxiliary_inp_idc` indicates that auxiliary input data is present in the input tensor of the neural network post-processing filter. A value of 0 for `nnpfc_auxiliary_inp_idc` indicates that auxiliary input data is not present in the input tensor. A value of 1 for `nnpfc_auxiliary_inp_idc` indicates that the auxiliary input data is derived as specified in Table 23. The value of `nnpfc_auxiliary_inp_idc` should be in the range of 0 to 255 (inclusive). Values of `nnpfc_auxiliary_inp_idc` greater than 1 are reserved for future specifications of ITU-T|ISO / IEC and should not be present in bitstreams conforming to this version of the specification. Decoders conforming to this version of the specification should ignore SEI messages that include reserved values of `nnpfc_auxiliary_inp_idc`.

[0066] `nnpfc_separate_colour_description_present_flag` equal to 1 indicates that a distinct combination of colour primaries, transfer characteristics, and matrix coefficients for the image produced by the post-processing filter is specified in the SEI message syntax structure. `nnpfc_separate_colour_description_present_flag` equal to 0 indicates that the combination of colour primaries, transfer characteristics, and matrix coefficients for the image produced by the post-processing filter is the same as that indicated in the VUI parameters of the CLVS.

[0067] `nnpfc_colour_primaries` has the same semantics as the `vui_colour_primaries` syntax element in Section 7.3, except in the following cases: – `nnpfc_colour_primaries` specifies the colour primaries of the image produced by the neural network post-processing filter specified in the SEI message, rather than the colour primaries used for the CLVS. – When `nnpfc_colour_primaries` is not present in the neural network post-processing filter characteristics SEI message, the value of `nnpfc_colour_primaries` is inferred to be equal to `vui_colour_primaries`. `nnpfc_transfer_characteristics` has the same semantics as the `vui_transfer_characteristics` syntax element in Section 7.3, except in the following cases: – `nnpfc_transfer_characteristics` specifies the transfer characteristics of the image produced by the neural network post-processing filter specified in the SEI message, rather than the transfer characteristics used for the CLVS. – When `nnpfc_transfer_characteristics` is not present in the neural network post-processing filter characteristics SEI message, the value of `nnpfc_transfer_characteristics` is inferred to be equal to `vui_transfer_characteristics`.

[0068] `nnpfc_matrix_coeffs` has the same semantics as the `vui_matrix_coeffs` syntax element specified in Section 7.3, except in the following cases: – `nnpfc_matrix_coeffs` specifies the matrix coefficients of the image produced by the neural network post-processing filter specified in the SEI message, rather than the matrix coefficients used for the CLVS. – When `nnpfc_matrix_coeffs` is not present in the neural network post-processing filter characteristics SEI message, the value of `nnpfc_matrix_coeffs` is inferred to be equal to `vui_matrix_coeffs`. – The allowed values for `nnpfc_matrix_coeffs` are not constrained by the chroma format of the decoded video images, which is indicated by the value of `ChromaFormatIdc` for the semantics of the VUI parameters. – When `nnpfc_matrix_coeffs` equals 0, `nnpfc_out_order_idc` should not be equal to 1 or 3.

[0069] `nnpfc_inp_order_idc` indicates the method for ordering the sample arrays of the cropped decoded output image as input to the post-processing filter. Table 21 contains an informative description of the values of `nnpfc_inp_order_idc`. Table 23 specifies the semantics of `nnpfc_inp_order_idc` in the range of 0 to 3 (inclusive). That table defines the process for deriving the input tensor `inputTensor` for different `nnpfc_inp_order_idc` values and a given vertical sample coordinate `cTop` and horizontal sample coordinate `cLeft` (which specify the position of the top-left sample of the sample block included in the input tensor). When the chroma format of the cropped decoded output image is not 4:2:0, `nnpfc_inp_order_idc` should not be equal to 3. The value of `nnpfc_inp_order_idc` should be in the range of 0 to 255 (inclusive). Values of `nnpfc_inp_order_idc` greater than 3 are reserved for future ITU-T|ISO / IEC specifications and should not be present in bitstreams conforming to this version of the specification. Decoders conforming to this version of the specification should ignore SEI messages that include reserved values of `nnpfc_inp_order_idc`.
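The constraints on nnpfc_inp_order_idc can be collected into one check. The sketch below assumes that the 4:2:0 chroma format corresponds to SubWidthC equal to 2 and SubHeightC equal to 2, which is the usual convention but is not restated in this passage. The function name and the returned labels are hypothetical.

```python
def inp_order_idc_status(
    nnpfc_inp_order_idc: int, sub_width_c: int, sub_height_c: int
) -> str:
    """Return 'ok', 'reserved', 'not allowed' or 'out of range'."""
    if not 0 <= nnpfc_inp_order_idc <= 255:
        return "out of range"
    if nnpfc_inp_order_idc > 3:
        return "reserved"  # decoders would ignore the SEI message
    # Assumption: 4:2:0 means chroma is subsampled by 2 in both directions.
    is_420 = sub_width_c == 2 and sub_height_c == 2
    if nnpfc_inp_order_idc == 3 and not is_420:
        return "not allowed"
    return "ok"
```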

[0070] Table 21 – Description of information regarding nnpfc_inp_order_idc values

[0071]

[0072]

[0073] Figure 1 is an example illustration 100 (informative) of the luminance data channel when nnpfc_inp_order_idc is equal to 3.

[0074] A patch is a rectangular array of samples from one component of the image (e.g., the luminance or chrominance component). `nnpfc_constant_patch_size_flag` equal to 0 specifies that the post-processing filter accepts any patch size as input that is a positive integer multiple of the patch size indicated by `nnpfc_patch_width_minus1` and `nnpfc_patch_height_minus1`. When `nnpfc_constant_patch_size_flag` equals 0, the patch width should be less than or equal to `CroppedWidth`. When `nnpfc_constant_patch_size_flag` equals 0, the patch height should be less than or equal to `CroppedHeight`. `nnpfc_constant_patch_size_flag` equal to 1 specifies that the post-processing filter accepts exactly the patch size indicated by `nnpfc_patch_width_minus1` and `nnpfc_patch_height_minus1` as input.

[0075] When nnpfc_constant_patch_size_flag equals 1, nnpfc_patch_width_minus1+1 specifies the horizontal sample count of the input block size for the post-processing filter. When nnpfc_constant_patch_size_flag equals 0, any positive integer multiple of (nnpfc_patch_width_minus1+1) can be used as the horizontal sample count of the input block size for the post-processing filter. The value of nnpfc_patch_width_minus1 should be in the range of 0 to Min(32766, CroppedWidth-1) (inclusive). When nnpfc_constant_patch_size_flag equals 1, nnpfc_patch_height_minus1+1 specifies the vertical sample count of the input block size for the post-processing filter. When `nnpfc_constant_patch_size_flag` equals 0, any positive integer multiple of `nnpfc_patch_height_minus1 + 1` can be used as the vertical sample count for the block size used as the input to the post-processing filter. The value of `nnpfc_patch_height_minus1` should be in the range of 0 to Min(32766, CroppedHeight - 1) (inclusive). `nnpfc_overlap` specifies the horizontal and vertical sample counts of the overlap between adjacent input tensors of the post-processing filter. The value of `nnpfc_overlap` should be in the range of 0 to 16383 (inclusive).
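The patch-size rules can be illustrated for one dimension (width or height) as follows. The helper names are hypothetical. The upper bound on the candidate size when the flag equals 0 follows the statement that the patch width and height should not exceed CroppedWidth and CroppedHeight.

```python
def patch_minus1_in_range(patch_minus1: int, cropped_dim: int) -> bool:
    """Range check for nnpfc_patch_width_minus1 / nnpfc_patch_height_minus1."""
    return 0 <= patch_minus1 <= min(32766, cropped_dim - 1)


def patch_dim_accepted(
    candidate: int,
    patch_minus1: int,
    constant_patch_size_flag: int,
    cropped_dim: int,
) -> bool:
    """Check whether `candidate` samples is an acceptable patch dimension."""
    base = patch_minus1 + 1
    if constant_patch_size_flag == 1:
        # Only the exact signalled size is accepted.
        return candidate == base
    # Any positive integer multiple of the signalled size, not exceeding
    # the cropped picture dimension.
    return 0 < candidate <= cropped_dim and candidate % base == 0
```

With nnpfc_patch_width_minus1 equal to 63 and a cropped width of 1920, a width of 128 is acceptable when the flag equals 0 but not when it equals 1.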

[0076] The variables inpPatchWidth, inpPatchHeight, outPatchWidth, outPatchHeight, horCScaling, verCScaling, outPatchCWidth, outPatchCHeight, and overlapSize are derived as follows:

[0077]

[0078]

[0079] It is a requirement of bitstream conformance that `outPatchWidth * CroppedWidth` be equal to `nnpfc_pic_width_in_luma_samples * inpPatchWidth`, and that `outPatchHeight * CroppedHeight` be equal to `nnpfc_pic_height_in_luma_samples * inpPatchHeight`. `nnpfc_padding_type` specifies the padding process applied when referencing sample positions outside the boundaries of the cropped decoded output image, as described in Table 22. The value of `nnpfc_padding_type` should be in the range of 0 to 15 (inclusive).

[0080] Table 22 – Informative description of nnpfc_padding_type values

[0081]
nnpfc_padding_type    Description
0                     Zero padding
1                     Replication padding
2                     Reflection padding
3                     Wrap-around padding
4                     Fixed padding
5..15                 Reserved
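The dimension requirement at the start of this paragraph states that the ratio of output to input patch size must match the ratio of output to cropped picture size. It can be checked with integer arithmetic only, as sketched below; the function and parameter names are hypothetical.

```python
def output_dims_conform(
    inp_patch_w: int, inp_patch_h: int,
    out_patch_w: int, out_patch_h: int,
    cropped_w: int, cropped_h: int,
    pic_w: int, pic_h: int,
) -> bool:
    """pic_w / pic_h stand for nnpfc_pic_width/height_in_luma_samples."""
    width_ok = out_patch_w * cropped_w == pic_w * inp_patch_w
    height_ok = out_patch_h * cropped_h == pic_h * inp_patch_h
    return width_ok and height_ok
```

For instance, a filter that doubles the resolution, with 64x64 input patches and 128x128 output patches, satisfies the check for a 1920x1080 cropped image and a 3840x2160 output image.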

[0082] `nnpfc_luma_padding_val` specifies the luminance value used for padding when `nnpfc_padding_type` equals 4. `nnpfc_cb_padding_val` specifies the Cb value used for padding when `nnpfc_padding_type` equals 4. `nnpfc_cr_padding_val` specifies the Cr value used for padding when `nnpfc_padding_type` equals 4.

[0083] The function InpSampleVal(y,x,picHeight,picWidth,croppedPic) (whose inputs are the vertical sample position y, the horizontal sample position x, the image height picHeight, the image width picWidth, and the sample array croppedPic) returns the sampleVal value derived as follows:

[0084]

[0085]

[0086] Table 23 - Procedure for deriving the input tensor inputTensor for a given vertical sample coordinate cTop and horizontal sample coordinate cLeft (which specifies the position of the top-left sample of the sample block included in the input tensor).

[0087]

[0088]

[0089]

[0090]

[0091] A value greater than 0 for nnpfc_complexity_idc indicates the possible presence of one or more syntax elements indicating the complexity of the post-processing filter associated with nnpfc_id. A value equal to 0 for nnpfc_complexity_idc indicates the absence of a syntax element indicating the complexity of the post-processing filter associated with nnpfc_id. The value of nnpfc_complexity_idc should be in the range of 0 to 255 (inclusive). Values ​​of nnpfc_complexity_idc greater than 1 are reserved for future specifications of ITU-T|ISO / IEC and should not exist in bitstreams conforming to this version of the specification. Decoders conforming to this version of the specification should ignore SEI messages that include reserved values ​​of nnpfc_complexity_idc.

[0092] The nnpfc_out_format_flag being equal to 0 indicates that the sample values ​​output by the post-processing filter are real numbers, and the functions OutY and OutC used to convert the luminance and chrominance sample values ​​output by the post-processing filter to integer values ​​in bit depths BitDepthY and BitDepthC, respectively, are specified as follows:

[0093] OutY( x ) = Clip3( 0, ( 1 << BitDepthY ) - 1, Round( x * ( ( 1 << BitDepthY ) - 1 ) ) ) (81)

[0094] OutC( x ) = Clip3( 0, ( 1 << BitDepthC ) - 1, Round( x * ( ( 1 << BitDepthC ) - 1 ) ) ) (82)
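As an illustration of equations (81) and (82), the following Python sketch maps a real-valued filter output to an integer sample. It assumes that Clip3 clamps a value to a range and that Round rounds half away from zero, as is customary in the referenced video coding specifications. The names `clip3`, `spec_round`, and `out_sample` are hypothetical and not part of the disclosure.

```python
import math


def clip3(lo, hi, v):
    """Clamp v into the closed range [lo, hi] (assumed Clip3 definition)."""
    if v < lo:
        return lo
    if v > hi:
        return hi
    return v


def spec_round(x):
    """Round half away from zero: Sign(x) * Floor(Abs(x) + 0.5) (assumed Round definition)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def out_sample(x, bit_depth):
    """Equations (81)/(82): convert a real-valued output x to an integer at bit_depth."""
    max_val = (1 << bit_depth) - 1
    return clip3(0, max_val, spec_round(x * max_val))
```

The same function serves as OutY when called with BitDepthY and as OutC when called with BitDepthC, since the two equations differ only in the bit depth.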

[0095] nnpfc_out_format_flag equal to 1 indicates that the sample values output by the post-processing filter are unsigned integers, and that the functions OutY and OutC are specified as follows:

[0096]

[0097] `nnpfc_out_tensor_bitdepth_minus8` plus 8 specifies the bit depth of the sample values in the output integer tensor. The variable `outTensorBitDepth` is derived from this syntax element as follows:

[0098] outTensorBitDepth = nnpfc_out_tensor_bitdepth_minus8 + 8 (84)

[0099] Bitstream conformance requires that the value of nnpfc_out_tensor_bitdepth_minus8 be in the range of 0 to 24 (inclusive).

[0100] `nnpfc_out_order_idc` indicates the output order of the samples generated by the post-processing filter. Table 24 gives an informative description of the `nnpfc_out_order_idc` values. Table 25 specifies the semantics of `nnpfc_out_order_idc` in the range of 0 to 3 (inclusive): it specifies the process for deriving the sample values in the filtered output sample arrays `FilteredYPic`, `FilteredCbPic`, and `FilteredCrPic` from the output tensor `outputTensor` for the different values of `nnpfc_out_order_idc`, given the vertical sample coordinate `cTop` and the horizontal sample coordinate `cLeft` of the top-left sample of the sample block included in the input tensor. When `nnpfc_purpose` is equal to 2 or 4, `nnpfc_out_order_idc` should not be equal to 3. The value of `nnpfc_out_order_idc` should be in the range of 0 to 255 (inclusive). Values of `nnpfc_out_order_idc` greater than 3 are reserved for future specification by ITU-T | ISO/IEC and should not be present in bitstreams conforming to this version of the specification. Decoders conforming to this version of the specification should ignore SEI messages that include reserved values of `nnpfc_out_order_idc`.

[0101] Table 24 – Description of information regarding nnpfc_out_order_idc values

[0102]

[0103] Table 25 – Procedure for deriving the sample values in the filtered output sample arrays FilteredYPic, FilteredCbPic, and FilteredCrPic from the output tensor outputTensor for a given vertical sample coordinate cTop and horizontal sample coordinate cLeft (which specify the position of the top-left sample of the sample block included in the input tensor).

[0104]

[0105]

[0106]

[0107] The base post-processing filter for a cropped decoded output image picA is the filter identified by the first neural network post-processing filter characteristics SEI message, in decoding order, that has a particular nnpfc_id value within the CLVS.

[0108] If there is another neural network post-processing filter characteristics SEI message that has the same nnpfc_id value, has nnpfc_mode_idc equal to 1, has content different from the neural network post-processing filter characteristics SEI message that defines the base post-processing filter, and pertains to image picA, the base post-processing filter is updated by decoding the ISO/IEC 15938-17 bitstream in that SEI message to obtain the post-processing filter PostProcessingFilter(). Otherwise, the post-processing filter PostProcessingFilter() is set equal to the base post-processing filter.

[0109] The following process is used to filter the cropped decoded output image with the post-processing filter PostProcessingFilter() to generate the filtered image, which contains the Y, Cb, and Cr sample arrays FilteredYPic, FilteredCbPic, and FilteredCrPic, as indicated by nnpfc_out_order_idc.

[0110]

[0111] nnpfc_reserved_zero_bit should be equal to 0.

[0112] `nnpfc_uri_tag[i]` contains a NULL-terminated Unicode Transformation Format (UTF)-8 character string that specifies a tag URI. The UTF-8 string contains a URI, with syntax and semantics as specified in IETF Request for Comments (RFC) 4151, that uniquely identifies the format of, and associated information about, the neural network used as the post-processing filter specified by the `nnpfc_uri[i]` values. Note 4 – The `nnpfc_uri_tag[i]` elements represent a 'tag' URI, which makes it possible to uniquely identify the format of the neural network data specified by the `nnpfc_uri[i]` values without the need for a central registry. `nnpfc_uri[i]` contains a NULL-terminated UTF-8 character string, as specified in IETF Internet Standard 63. The UTF-8 string contains a URI, with syntax and semantics as specified in IETF Internet Standard 66, that identifies the neural network information (e.g., data representation) used as the post-processing filter. `nnpfc_payload_byte[i]` contains the i-th byte of a bitstream conforming to ISO/IEC 15938-17. The byte sequence `nnpfc_payload_byte[i]` for all present values of i should be a complete bitstream conforming to ISO/IEC 15938-17.

[0113] `nnpfc_parameter_type_idc` equal to 0 indicates that the neural network uses only integer parameters. `nnpfc_parameter_type_idc` equal to 1 indicates that the neural network may use floating-point or integer parameters. `nnpfc_parameter_type_idc` equal to 2 indicates that the neural network uses only binary parameters. `nnpfc_parameter_type_idc` equal to 3 is reserved for future specification by ITU-T | ISO/IEC and should not be present in bitstreams conforming to this version of the specification. Decoders conforming to this version of the specification should ignore SEI messages that include reserved values of `nnpfc_parameter_type_idc`. `nnpfc_log2_parameter_bit_length_minus3` equal to 0, 1, 2, and 3 indicates that the neural network does not use parameters with a bit length greater than 8, 16, 32, and 64, respectively. When `nnpfc_parameter_type_idc` is present and `nnpfc_log2_parameter_bit_length_minus3` is not present, the neural network does not use parameters with a bit length greater than 1. `nnpfc_num_parameters_idc` indicates the maximum number of neural network parameters for the post-processing filter, in powers of 2048. `nnpfc_num_parameters_idc` equal to 0 indicates that the maximum number of neural network parameters is not specified. The value of `nnpfc_num_parameters_idc` should be in the range of 0 to 52 (inclusive). Values of `nnpfc_num_parameters_idc` greater than 52 are reserved for future specification by ITU-T | ISO/IEC and should not be present in bitstreams conforming to this version of the specification. Decoders conforming to this version of the specification should ignore SEI messages that include reserved values of `nnpfc_num_parameters_idc`. If the value of `nnpfc_num_parameters_idc` is greater than zero, the variable `maxNumParameters` is derived as follows:

[0114] maxNumParameters = (2048 << nnpfc_num_parameters_idc) - 1 (86)

[0115] Bitstream conformance requires that the number of neural network parameters of the post-processing filter be less than or equal to `maxNumParameters`. `nnpfc_num_kmac_operations_idc` greater than 0 indicates that the maximum number of multiply-accumulate operations per sample of the post-processing filter is less than or equal to `nnpfc_num_kmac_operations_idc * 1000`. `nnpfc_num_kmac_operations_idc` equal to 0 indicates that the maximum number of multiply-accumulate operations of the network is not specified. The value of `nnpfc_num_kmac_operations_idc` should be in the range of 0 to 2^32 - 1 (inclusive).
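The complexity limits described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a normative parser: the function names are hypothetical, and `None` is used here to represent "not specified".

```python
def max_parameter_bit_length(log2_parameter_bit_length_minus3):
    """Values 0, 1, 2, 3 map to upper bounds of 8, 16, 32, 64 bits."""
    if not 0 <= log2_parameter_bit_length_minus3 <= 3:
        raise ValueError("expected a value in 0..3")
    return 1 << (log2_parameter_bit_length_minus3 + 3)


def max_num_parameters(num_parameters_idc):
    """Equation (86); returns None when the limit is not specified (idc == 0)."""
    if not 0 <= num_parameters_idc <= 52:
        raise ValueError("reserved value of nnpfc_num_parameters_idc")
    if num_parameters_idc == 0:
        return None
    return (2048 << num_parameters_idc) - 1


def max_macs_per_sample(num_kmac_operations_idc):
    """Upper bound on multiply-accumulate operations per sample, or None if unspecified."""
    if num_kmac_operations_idc == 0:
        return None
    return num_kmac_operations_idc * 1000
```

For instance, `nnpfc_num_parameters_idc` equal to 1 gives a limit of 4095 parameters, and `nnpfc_num_kmac_operations_idc` equal to 3 gives a limit of 3000 multiply-accumulate operations per sample.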

[0116] 8.29 Neural Network Post-Processing Filter Activation SEI Message

[0117] 8.29.1 Neural Network Post-Processing Filter Activation SEI Message Syntax

[0118]

[0119] 8.29.2 Neural Network Post-Processing Filter Activation SEI Message Semantics

[0120] This SEI message specifies the neural network post-processing filter that can be used for post-processing filtering of the current image. The neural network post-processing filter activation SEI message persists only for the current image. Note – Multiple neural network post-processing filter activation SEI messages may be present for the same image, for example when the post-processing filters serve different purposes or filter different color components. `nnpfa_id` specifies that the neural network post-processing filter specified by one or more neural network post-processing filter characteristics SEI messages that pertain to the current image and have `nnpfc_id` equal to `nnpfa_id` can be used for post-processing filtering of the current image.

[0121] 4. The technical problem solved by the disclosed technical solution

[0122] The example designs for neural network post-processing filter characteristics (NNPFC) SEI messages and neural network post-processing filter activation (NNPFA) SEI messages have the following problems.

[0123] First, even if an NNPFC SEI message with the same neural network post-processing filter ID does not exist in the bitstream, an NNPFA SEI message with a specific neural network post-processing filter ID may exist in the image unit. However, in this case, the existence of the NNPFA SEI message would be meaningless.

[0124] Secondly, within an image unit, an NNPFA SEI message with a specific neural network post-processing filter ID may exist before an NNPFC SEI message with the same neural network post-processing filter ID, in the order of decoding. However, it would be clearer for an NNPFA SEI message to be located after its associated NNPFC SEI message in the order of decoding.

[0125] 5. List of solutions and implementation examples

[0126] To address the aforementioned issues, some summarized methods are presented below. These examples should be considered as illustrations of general concepts, rather than being interpreted narrowly. Furthermore, these examples can be applied individually or in combination in any way.

[0127] Example 1

[0128] The solution to the first problem is as follows. In one example, it is stipulated that an NNPFA SEI message with a particular value of the neural network post-processing filter activation (NNPFA) identifier (nnpfa_id) should not be present in the current PU unless at least one of the following two conditions is true: a) within the current CLVS, an NNPFC SEI message with the neural network post-processing filter characteristics (NNPFC) identifier (nnpfc_id) equal to that particular value of nnpfa_id is present in a PU that precedes the current PU in decoding order; b) an NNPFC SEI message with nnpfc_id equal to that particular value of nnpfa_id is present in the current PU.
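The presence constraint of Example 1 can be illustrated with a short conformance check. The sketch below is hypothetical: it models a CLVS as a list of PUs in decoding order and each PU as a list of `(kind, id)` pairs, and it assumes, consistent with the abstract, that condition b) requires the NNPFC SEI message in the current PU to carry an nnpfc_id equal to nnpfa_id. The function name `nnpfa_presence_ok` is not part of the disclosure.

```python
from typing import List, Tuple

Sei = Tuple[str, int]       # (kind, identifier); kind is "NNPFC" or "NNPFA"
PictureUnit = List[Sei]     # SEI messages of one PU, in decoding order


def nnpfa_presence_ok(clvs: List[PictureUnit]) -> bool:
    """Return True if every NNPFA message satisfies condition a) or b)."""
    earlier_nnpfc_ids = set()   # nnpfc_id values seen in preceding PUs
    for pu in clvs:
        nnpfc_ids_in_pu = {ident for kind, ident in pu if kind == "NNPFC"}
        for kind, ident in pu:
            if kind != "NNPFA":
                continue
            in_earlier_pu = ident in earlier_nnpfc_ids    # condition a)
            in_current_pu = ident in nnpfc_ids_in_pu      # condition b)
            if not (in_earlier_pu or in_current_pu):
                return False
        earlier_nnpfc_ids |= nnpfc_ids_in_pu
    return True
```

This check covers presence only. It deliberately ignores the relative order of the two message types within a PU.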

[0129] Example 2

[0130] The solution to the second problem is now described. In one example, it is stipulated that when the PU includes both an NNPFC SEI message with a specific value of nnpfc_id and an NNPFA SEI message with nnpfa_id equal to a specific value of nnpfc_id, the NNPFC SEI message should precede the NNPFA SEI message in the decoding order.
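The ordering constraint of Example 2 can be sketched in the same spirit. The code below is an illustration under assumptions: a PU is modeled as a list of `(kind, id)` pairs in decoding order, the function name `nnpfc_precedes_nnpfa` is hypothetical, and the rule is read strictly, so that no NNPFC SEI message may follow an NNPFA SEI message with the same identifier within the PU.

```python
from typing import List, Tuple


def nnpfc_precedes_nnpfa(pu: List[Tuple[str, int]]) -> bool:
    """Return True if, for each identifier, NNPFC messages precede NNPFA messages."""
    activated_ids = set()   # nnpfa_id values already encountered in this PU
    for kind, ident in pu:
        if kind == "NNPFA":
            activated_ids.add(ident)
        elif kind == "NNPFC" and ident in activated_ids:
            # An NNPFC message follows an NNPFA message with the same identifier.
            return False
    return True
```

An NNPFA SEI message whose matching NNPFC SEI message is in an earlier PU passes this check, since the ordering rule applies only when both messages are in the same PU.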

[0131] 6. Example

[0132] The following are some example implementations of some of the items summarized in Examples 1 and 2 of Section 5 above.

[0133] Most of the relevant parts that have been added or modified are shown in bold, while some deleted parts are shown in bold italics. There may also be other changes that are editorial in nature and therefore are not highlighted.

[0134] 6.1 First Embodiment

[0135] This embodiment corresponds to Examples 1 and 2 summarized in Section 5 above.

[0136] 8.29.2 Neural Network Post-Processing Filter Activation SEI Message Semantics

[0137] The SEI message specifies the neural network post-processing filter that can be used for post-processing filtering of the current image.

[0138] The neural network post-processing filter activation SEI message persists only for the current image.

[0139] Note – Multiple neural network post-processing filter activation SEI messages may be present for the same image, for example when the post-processing filters serve different purposes or filter different color components.

[0140]

[0141] nnpfa_id specifies that the neural network post-processing filter specified by one or more neural network post-processing filter characteristics SEI messages that pertain to the current image and have nnpfc_id equal to nnpfa_id can be used for post-processing filtering of the current image. ...

[0143] 7. References

[0144] [1] ITU-T and ISO/IEC, "High efficiency video coding", Rec. ITU-T H.265 | ISO/IEC 23008-2 (in force edition).

[0145] [2] J. Chen, E. Alshina, G. J. Sullivan, J.-R. Ohm, J. Boyce, "Algorithm description of Joint Exploration Test Model 7 (JEM7)," JVET-G1001, Aug. 2017.

[0146] [3] Rec. ITU-T H.266 | ISO/IEC 23090-3, "Versatile Video Coding", 2022.

[0147] [4] Rec. ITU-T H.274 | ISO/IEC 23002-7, "Versatile Supplemental Enhancement Information Messages for Coded Video Bitstreams", 2022.

[0148] [5] S. McCarthy, T. Chujoh, M. Hannuksela, G. Sullivan, and Y.-K. Wang (editors), "Additional SEI messages for VSEI (Draft 2)," JVET output document JVET-AA2006, publicly available online at: https://www.jvet-experts.org/doc_end_user/current_document.php?id=11947.

[0149] Figure 2 is a block diagram illustrating an example video processing system 4000 in which various techniques disclosed herein may be implemented. Various implementations may include some or all of the components of system 4000. System 4000 may include an input 4002 for receiving video content. The video content may be received in a raw or uncompressed format (e.g., 8-bit or 10-bit multi-component pixel values), or may be received in a compressed or encoded format. Input 4002 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interfaces include wired interfaces such as Ethernet and passive optical network (PON), and wireless interfaces such as Wi-Fi or cellular interfaces.

[0150] System 4000 may include a coding component 4004 that may implement the various encoding or decoding methods described in this document. The coding component 4004 may reduce the average bit rate of the video from input 4002 to the output of the coding component 4004 to produce a coded representation of the video. Coding techniques are therefore sometimes referred to as video compression or video transcoding techniques. The output of the coding component 4004 may be stored or transmitted over a connected communication link, as represented by component 4006. The stored or transmitted bitstream (or coded) representation of the video received at input 4002 may be used by component 4008 to generate pixel values or displayable video, which is sent to display interface 4010. The process of generating user-viewable video from the bitstream representation is sometimes referred to as video decompression. Furthermore, although some video processing operations are referred to as “coding” operations or tools, it should be understood that the coding tools or operations are used at the encoder, and that the decoder performs the corresponding decoding tools or operations that reverse the result of the coding.

[0151] Examples of peripheral bus interfaces or display interfaces may include Universal Serial Bus (USB), High Definition Multimedia Interface (HDMI), or DisplayPort. Examples of storage interfaces include Serial Advanced Technology Attachment (SATA), Peripheral Component Interconnect (PCI), Integrated Drive Electronics (IDE), and so on. The techniques described in this document can be embodied in a variety of electronic devices, such as mobile phones, laptops, smartphones, or other devices capable of performing digital data processing and/or video display.

[0152] Figure 3 is a block diagram of an example video processing apparatus 4100. Apparatus 4100 can be used to implement one or more methods described herein. Apparatus 4100 can be embodied in a smartphone, tablet computer, computer, Internet of Things (IoT) receiver, etc. Apparatus 4100 may include one or more processors 4102, one or more memories 4104, and video processing circuitry 4106. The processor(s) 4102 can be configured to implement one or more methods described herein. The memory (or memories) 4104 can be used to store data and code for implementing the methods and techniques described herein. Video processing circuitry 4106 can be used to implement some of the techniques described herein in hardware circuitry. In some embodiments, video processing circuitry 4106 may be at least partially included in processor 4102 (e.g., a graphics coprocessor).

[0153] Figure 4 is a flowchart of an example method 4200 for video processing. Method 4200 includes determining, at step 4202, the neural network post-processing filter activation (NNPFA) identifier (nnpfa_id) in an NNPFA supplemental enhancement information (SEI) message and the neural network post-processing filter characteristics (NNPFC) identifier (nnpfc_id) in an NNPFC SEI message. When the NNPFA SEI message and the NNPFC SEI message are located in the same picture unit (PU) and nnpfa_id equals nnpfc_id, the NNPFC SEI message should precede the NNPFA SEI message in decoding order. At step 4204, a conversion between visual media data and a bitstream is performed based on the NNPFA SEI message and the NNPFC SEI message.

[0154] It should be noted that method 4200 can be implemented in an apparatus for processing video data that includes a processor and a non-transitory memory having instructions thereon, such as a video encoder 4400, a video decoder 4500, and/or an encoder 4600. In this case, the instructions, when executed by the processor, cause the processor to perform method 4200. Furthermore, method 4200 can be performed by a non-transitory computer-readable medium comprising a computer program product for use by a video encoding/decoding device. The computer program product includes computer-executable instructions stored on the non-transitory computer-readable medium that, when executed by a processor, cause the video encoding/decoding device to perform method 4200.

[0155] Figure 5 is a block diagram illustrating an example video encoding/decoding system 4300 that can utilize the techniques disclosed herein. The video encoding/decoding system 4300 may include a source device 4310 and a target device 4320. The source device 4310 generates encoded video data and may be referred to as a video encoding device. The target device 4320 can decode the encoded video data generated by the source device 4310 and may be referred to as a video decoding device.

[0156] Source device 4310 may include video source 4312, video encoder 4314, and input/output (I/O) interface 4316. Video source 4312 may include, for example, a video capture device, an interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources. Video data may include one or more pictures. Video encoder 4314 encodes the video data from video source 4312 to generate a bitstream. The bitstream may include a sequence of bits that form a coded representation of the video data. The bitstream may include coded pictures and associated data. A coded picture is an encoded representation of a picture. Associated data may include sequence parameter sets, picture parameter sets, and other syntax structures. I/O interface 4316 may include a modulator/demodulator (modem) and/or a transmitter. Encoded video data may be transmitted directly to target device 4320 via network 4330 through I/O interface 4316. Encoded video data may also be stored on storage medium/server 4340 for access by target device 4320.

[0157] Target device 4320 may include I/O interface 4326, video decoder 4324, and display device 4322. I/O interface 4326 may include a receiver and/or a modem. I/O interface 4326 may acquire encoded video data from source device 4310 or storage medium/server 4340. Video decoder 4324 may decode the encoded video data. Display device 4322 may display the decoded video data to a user. Display device 4322 may be integrated with target device 4320, or may be external to target device 4320, in which case target device 4320 may be configured to interface with the external display device.

[0158] The video encoder 4314 and the video decoder 4324 can operate according to video compression standards such as the High Efficiency Video Coding (HEVC) standard, the Versatile Video Coding (VVC) standard, and other current and/or further standards.

[0159] Figure 6 is a block diagram illustrating an example of a video encoder 4400, which may be the video encoder 4314 in the system 4300 shown in Figure 5. The video encoder 4400 can be configured to perform any or all of the techniques described in this disclosure. The video encoder 4400 includes multiple functional components. The techniques described in this disclosure can be shared among the various components of the video encoder 4400. In some examples, a processor can be configured to perform any or all of the techniques described in this disclosure.

[0160] The functional components of the video encoder 4400 may include a segmentation unit 4401, a prediction unit 4402 (which may include a mode selection unit 4403, a motion estimation unit 4404, a motion compensation unit 4405, and an intra-frame prediction unit 4406), a residual generation unit 4407, a transform processing unit 4408, a quantization unit 4409, an inverse quantization unit 4410, an inverse transform unit 4411, a reconstruction unit 4412, a buffer 4413, and an entropy coding unit 4414.

[0161] In other examples, the video encoder 4400 may include more, fewer, or different functional components. In one example, the prediction unit 4402 may include an intra-block copy (IBC) unit. The IBC unit can perform prediction in IBC mode, where at least one reference picture is the picture in which the current video block is located.

[0162] In addition, some components (such as motion estimation unit 4404 and motion compensation unit 4405) may be highly integrated, but are shown separately in the example of video encoder 4400 for illustrative purposes.

[0163] The segmentation unit 4401 can segment an image into one or more video blocks. The video encoder 4400 and the video decoder 4500 can support various video block sizes.

[0164] The mode selection unit 4403 can, for example, select one of the coding modes (intra or inter) based on error results, and provide the resulting intra-coded or inter-coded block to the residual generation unit 4407 to generate residual block data and to the reconstruction unit 4412 to reconstruct the coded block for use as a reference picture. In some examples, the mode selection unit 4403 can select a combined intra and inter prediction (CIIP) mode, in which the prediction is based on both an inter prediction signal and an intra prediction signal. In the case of inter prediction, the mode selection unit 4403 can also select the motion vector resolution for the block (e.g., sub-pixel or integer-pixel precision).

[0165] To perform inter prediction on the current video block, motion estimation unit 4404 can generate motion information for the current video block by comparing one or more reference frames from buffer 4413 with the current video block. Motion compensation unit 4405 can determine the predicted video block for the current video block based on the motion information and on decoded samples of images from buffer 4413 other than the image associated with the current video block.

[0166] The motion estimation unit 4404 and the motion compensation unit 4405 can perform different operations on the current video block, depending on whether the current video block is in an I slice, a P slice, or a B slice.

[0167] In some examples, motion estimation unit 4404 can perform unidirectional prediction on the current video block, and can search for a reference video block for the current video block in the reference images of list 0 or list 1. Motion estimation unit 4404 can then generate a reference index indicating a reference image in list 0 or list 1, which includes a reference video block and a motion vector indicating the spatial displacement between the current video block and the reference video block. Motion estimation unit 4404 can output the reference index, prediction direction indicator, and motion vector as motion information for the current video block. Motion compensation unit 4405 can generate a predicted video block for the current block based on the reference video block indicated by the motion information of the current video block.

[0168] In other examples, motion estimation unit 4404 can perform bidirectional prediction on the current video block. Motion estimation unit 4404 can search for a reference video block for the current video block in the reference images in list 0, and can also search for another reference video block for the current video block in the reference images in list 1. Motion estimation unit 4404 can then generate a reference index indicating the reference images in lists 0 and 1, which include the reference video block and a motion vector indicating the spatial displacement between the reference video block and the current video block. Motion estimation unit 4404 can output the reference index and motion vector of the current video block as motion information for the current video block. Motion compensation unit 4405 can generate a predicted video block for the current video block based on the reference video block indicated by the motion information of the current video block.

[0169] In some examples, the motion estimation unit 4404 may output a complete set of motion information for the decoder's decoding process. In other examples, the motion estimation unit 4404 may not output a complete set of motion information for the current video block. Instead, the motion estimation unit 4404 may signal the motion information of the current video block by reference to the motion information of another video block. For example, the motion estimation unit 4404 may determine that the motion information of the current video block is sufficiently similar to the motion information of an adjacent video block.

[0170] In one example, the motion estimation unit 4404 may indicate a value in the syntax structure associated with the current video block that indicates to the video decoder 4500 that the current video block has the same motion information as another video block.

[0171] In another example, motion estimation unit 4404 can identify another video block and motion vector difference (MVD) in the syntax structure associated with the current video block. The motion vector difference indicates the difference between the motion vector of the current video block and the motion vector of the indicated video block. Video decoder 4500 can use the motion vector of the indicated video block and the motion vector difference to determine the motion vector of the current video block.
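The motion vector difference signaling described above amounts to a subtraction at the encoder and an addition at the decoder. The following Python sketch illustrates this under simple assumptions: motion vectors are modeled as integer `(x, y)` tuples, and the function names `compute_mvd` and `reconstruct_mv` are hypothetical.

```python
from typing import Tuple

MotionVector = Tuple[int, int]   # (horizontal, vertical) displacement


def compute_mvd(current_mv: MotionVector, indicated_mv: MotionVector) -> MotionVector:
    """Encoder side: difference between the current MV and the indicated block's MV."""
    return (current_mv[0] - indicated_mv[0], current_mv[1] - indicated_mv[1])


def reconstruct_mv(indicated_mv: MotionVector, mvd: MotionVector) -> MotionVector:
    """Decoder side: recover the current MV from the indicated block's MV and the MVD."""
    return (indicated_mv[0] + mvd[0], indicated_mv[1] + mvd[1])
```

Because the decoder applies the inverse of the encoder's subtraction, the reconstructed motion vector equals the original one whenever both sides use the same indicated video block.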

[0172] As described above, the video encoder 4400 can predictively signal motion vectors. Two examples of predictive signaling techniques that can be implemented by the video encoder 4400 are advanced motion vector prediction (AMVP) and merge mode signaling.

[0173] Intra-prediction unit 4406 can perform intra-prediction on the current video block. When intra-prediction unit 4406 performs intra-prediction on the current video block, it can generate prediction data for the current video block based on the decoded samples of other video blocks in the same frame. The prediction data for the current video block may include the predicted video block and various syntax elements.

[0174] The residual generation unit 4407 can generate residual data for the current video block by subtracting the predicted video block of the current video block from the current video block. The residual data of the current video block may include residual video blocks corresponding to different sample components of the samples in the current video block.
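Residual generation as described above is a sample-wise subtraction. The sketch below illustrates it for a single sample component, assuming blocks are represented as equally sized lists of rows of integers. The function name `residual_block` is hypothetical.

```python
from typing import List

Block = List[List[int]]   # rows of integer sample values for one component


def residual_block(current: Block, predicted: Block) -> Block:
    """Subtract the predicted block from the current block, sample by sample."""
    if len(current) != len(predicted) or any(
        len(c_row) != len(p_row) for c_row, p_row in zip(current, predicted)
    ):
        raise ValueError("blocks must have the same dimensions")
    return [
        [c - p for c, p in zip(c_row, p_row)]
        for c_row, p_row in zip(current, predicted)
    ]
```

A good prediction yields a residual whose values are close to zero, which is what makes the subsequent transform and quantization stages effective.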

[0175] In other examples, such as in skip mode, there may be no residual data for the current video block, and the residual generation unit 4407 may not perform subtraction operations.

[0176] The transform processing unit 4408 can generate one or more transform coefficient video blocks for the current video block by applying one or more transforms to the residual video block associated with the current video block.

[0177] After the transform processing unit 4408 generates a transform coefficient video block associated with the current video block, the quantization unit 4409 can quantize the transform coefficient video block associated with the current video block based on one or more quantization parameter (QP) values associated with the current video block.

[0178] The inverse quantization unit 4410 and the inverse transform unit 4411 can apply inverse quantization and inverse transform to the transform coefficient video block, respectively, to reconstruct the residual video block from the transform coefficient video block. The reconstruction unit 4412 can add the reconstructed residual video block to the corresponding samples of one or more predicted video blocks generated by the prediction unit 4402 to produce a reconstructed video block associated with the current block, so as to be stored in the buffer 4413.

[0179] After the video block is reconstructed by reconstruction unit 4412, a loop filtering operation can be performed to reduce video blocking artifacts in the video block.

[0180] The entropy encoding unit 4414 can receive data from other functional components of the video encoder 4400. When the entropy encoding unit 4414 receives data, it can perform one or more entropy encoding operations to generate entropy-encoded data and output a bitstream that includes the entropy-encoded data.

[0181] Figure 7 is a block diagram illustrating an example of a video decoder 4500, which may be the video decoder 4324 in the system 4300 shown in Figure 5. The video decoder 4500 can be configured to perform any or all of the techniques described in this disclosure. In the example shown, the video decoder 4500 includes multiple functional components. The techniques described in this disclosure can be shared among the various components of the video decoder 4500. In some examples, a processor can be configured to perform any or all of the techniques described in this disclosure.

[0182] In the example shown, the video decoder 4500 includes an entropy decoding unit 4501, a motion compensation unit 4502, an intra-frame prediction unit 4503, an inverse quantization unit 4504, an inverse transform unit 4505, a reconstruction unit 4506, and a buffer 4507. In some examples, the video decoder 4500 can perform a decoding pass that is generally reciprocal to the encoding pass described with respect to the video encoder 4400.

[0183] The entropy decoding unit 4501 can retrieve the encoded bitstream. The encoded bitstream may include entropy-coded video data (e.g., encoded blocks of video data). The entropy decoding unit 4501 can decode the entropy-coded video data, and from the entropy-decoded video data the motion compensation unit 4502 can determine motion information, including motion vectors, motion vector precision, reference image list indices, and other motion information. The motion compensation unit 4502 can determine this information, for example, by performing AMVP and merge mode.

[0184] The motion compensation unit 4502 can generate motion compensation blocks, thereby potentially performing interpolation based on an interpolation filter. The identifier of the interpolation filter used at sub-pixel precision can be included in the syntax element.

[0185] The motion compensation unit 4502 can use the interpolation filter used by the video encoder 4400 during the encoding of the video block to calculate the sub-integer pixel interpolation of the reference block. The motion compensation unit 4502 can determine the interpolation filter used by the video encoder 4400 based on the received syntax information, and use the interpolation filter to generate the prediction block.

[0186] The motion compensation unit 4502 may use some of the syntax information to determine the sizes of the blocks used to encode the frames and / or slices of the encoded video sequence, partition information describing how each macroblock of a picture of the encoded video sequence is partitioned, modes indicating how each partition is encoded, one or more reference frames (and reference frame lists) for each inter-frame coded block, and other information used to decode the encoded video sequence.

[0187] The intra-frame prediction unit 4503 can use, for example, an intra-prediction mode received in the bitstream to form prediction blocks from spatially adjacent blocks. The inverse quantization unit 4504 inverse quantizes, i.e., dequantizes, the quantized video block coefficients provided in the bitstream and decoded by the entropy decoding unit 4501. The inverse transform unit 4505 applies the inverse transform.

[0188] The reconstruction unit 4506 can add the residual block to the corresponding prediction block generated by the motion compensation unit 4502 or the intra-frame prediction unit 4503 to form a decoded block. If necessary, a deblocking filter can also be applied to filter the decoded block to eliminate block artifacts. The decoded video block is then stored in a buffer 4507, which provides a reference block for subsequent motion compensation / intra-frame prediction and also generates decoded video for presentation on a display device.
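The reconstruction step can be sketched as follows. This is a minimal illustration, not the decoder 4500 itself: the function name `reconstruct_block`, the nested-list block layout, and the clipping to the sample range for an assumed 8-bit depth are assumptions introduced for the example.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add a residual block to a prediction block, sample by sample.

    Assumption: results are clipped to [0, 2**bit_depth - 1].
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]


decoded = reconstruct_block([[250, 10], [100, 100]], [[10, -20], [5, -5]])
```

The decoded block would then be passed to any loop filtering and stored in the buffer for use as a reference.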

[0189] Figure 8 is a schematic diagram of an example encoder 4600. Encoder 4600 is suitable for implementing VVC technology. Encoder 4600 includes three loop filters: a deblocking filter (DF) 4602, a sample adaptive offset (SAO) 4604, and an adaptive loop filter (ALF) 4606. Unlike DF 4602, which uses predefined filters, SAO 4604 and ALF 4606 use the original samples of the current image to reduce the mean square error between the original and reconstructed samples by adding an offset and by applying a finite impulse response (FIR) filter, respectively, with the offsets and filter coefficients signaled as coded side information. ALF 4606 is located at the final processing stage of each image and can be regarded as a tool that attempts to capture and repair artifacts created by the previous stages.
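The idea of deriving offsets from the original samples and signaling them as side information can be illustrated with a deliberately simplified sketch. It is not the SAO design of VVC: the four equal-width bands, the helper names, and the omission of clipping are all assumptions made to keep the example short.

```python
def band_index(sample, num_bands=4, bit_depth=8):
    """Map a sample value to one of `num_bands` equal-width intensity bands."""
    return (sample * num_bands) >> bit_depth


def derive_offsets(original, reconstructed, num_bands=4):
    """Encoder side: mean error per band, rounded to an integer offset."""
    sums = [0] * num_bands
    counts = [0] * num_bands
    for o, r in zip(original, reconstructed):
        b = band_index(r, num_bands)
        sums[b] += o - r
        counts[b] += 1
    return [round(s / c) if c else 0 for s, c in zip(sums, counts)]


def apply_offsets(reconstructed, offsets):
    """Decoder side: add the signaled offset of each sample's band."""
    n = len(offsets)
    return [r + offsets[band_index(r, n)] for r in reconstructed]


def mse(a, b):
    """Mean square error between two equally long sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


original = [12, 14, 200, 204]
reconstructed = [10, 12, 203, 207]
offsets = derive_offsets(original, reconstructed)  # the side information
filtered = apply_offsets(reconstructed, offsets)
```

Only the encoder has access to the original samples, so the offsets have to travel in the bitstream for the decoder to apply the same correction.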

[0190] The encoder 4600 also includes an intra-frame prediction component 4608 and a motion estimation / compensation (ME / MC) component 4610 configured to receive input video. The intra-frame prediction component 4608 is configured to perform intra-frame prediction, while the ME / MC component 4610 is configured to perform inter-frame prediction using a reference image obtained from a reference image buffer 4612. Residual blocks from inter-frame or intra-frame prediction are fed into a transform (T) component 4614 and a quantization (Q) component 4616 to generate quantized residual transform coefficients, which are then fed into an entropy codec component 4618. The entropy codec component 4618 entropy-encodes the prediction results and the quantized transform coefficients and sends them to a video decoder (not shown). Quantized components output from the quantization component 4616 can be fed into an inverse quantization (IQ) component 4620, an inverse transform component 4622, and a reconstruction (REC) component 4624. REC component 4624 can output images to DF 4602, SAO 4604 and ALF 4606 for filtering before these images are stored in reference image buffer 4612.
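The quantization (Q) and inverse quantization (IQ) path of this loop can be sketched as a round trip. This is a minimal illustration under stated assumptions: a single uniform step size, the function names `quantize` and `dequantize`, and the omission of the transform stage are choices made for the example and are not taken from this document.

```python
def quantize(coeffs, step):
    """Map transform coefficients to integer levels using a uniform step."""
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, step):
    """Scale integer levels back to approximate coefficient values."""
    return [level * step for level in levels]


step = 8
coeffs = [101, -37, 3]
levels = quantize(coeffs, step)       # what the entropy coder would encode
restored = dequantize(levels, step)   # what both encoder and decoder reconstruct
```

Running the inverse path inside the encoder keeps its reference images identical to those the decoder will reconstruct, since both work from the same quantized levels.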

[0191] Figure 9 is a flowchart of an example method 4700 for video processing. Method 4700 includes, at step 4702, performing a conversion between visual media data and a bitstream based on a rule. The rule specifies that a Neural Network Post-Processing Filter Activation (NNPFA) Supplemental Enhancement Information (SEI) message having a first specific value of an NNPFA identifier is present in the current picture unit (PU) only when one or both of the following conditions are met: a) the currently coded layer video sequence (CLVS) includes, in a previous PU preceding the current PU in decoding order, a Neural Network Post-Processing Filter Feature (NNPFC) SEI message whose NNPFC identifier (nnpfc_id) is equal to the first specific value of the NNPFA identifier; or b) the current PU includes an NNPFC SEI message whose nnpfc_id is equal to the first specific value of the NNPFA identifier. Depending on the example, the conversion at step 4702 can be performed at an encoder or at a decoder.
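The presence rule of method 4700 can be expressed as a conformance check over the PUs of one CLVS. The following is a minimal sketch, assuming each PU is modeled only by the identifier values of the NNPFC and NNPFA SEI messages it carries; the `PictureUnit` class and the function name are hypothetical and are not part of any bitstream syntax.

```python
from dataclasses import dataclass, field


@dataclass
class PictureUnit:
    """Hypothetical model of a PU: identifier values of its SEI messages."""
    nnpfc_ids: list = field(default_factory=list)
    nnpfa_ids: list = field(default_factory=list)


def nnpfa_presence_ok(clvs):
    """Check the presence rule for a list of PUs given in decoding order.

    An NNPFA identifier is allowed in a PU only if an NNPFC SEI message with
    the same identifier appears in an earlier PU of the CLVS (condition a)
    or in the same PU (condition b).
    """
    seen_nnpfc = set()
    for pu in clvs:
        available = seen_nnpfc | set(pu.nnpfc_ids)
        if any(nnpfa_id not in available for nnpfa_id in pu.nnpfa_ids):
            return False
        seen_nnpfc = available
    return True


conforming = [PictureUnit(nnpfc_ids=[1]), PictureUnit(nnpfa_ids=[1])]
violating = [PictureUnit(nnpfa_ids=[2]), PictureUnit(nnpfc_ids=[2])]
```

An encoder could run such a check before emitting an NNPFA SEI message, and a decoder could rely on the rule to assume that the referenced filter description is already available.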

[0192] It should be noted that method 4700 can be implemented in an apparatus for processing video data, including a processor and a non-transitory memory having instructions thereon, such as a video encoder 4400, a video decoder 4500, and / or an encoder 4600. In this case, the instructions, when executed by the processor, cause the processor to perform method 4700. Furthermore, method 4700 can be executed by a non-transitory computer-readable medium comprising a computer program product for use by a video encoding / decoding device. The computer program product includes computer-executable instructions stored on the non-transitory computer-readable medium that, when executed by a processor, cause the video encoding / decoding device to perform method 4700.

[0193] The following is a list of preferred solutions based on some examples.

[0194] The following solutions illustrate examples of the techniques discussed in this document.

[0195] 1. A method for processing media data, comprising: determining the value of a second NNPFA identifier (nnpfa_id) in a Neural Network Post-Processing Filter Activation (NNPFA) Supplemental Enhancement Information (SEI) message associated with a current picture unit (PU), wherein the NNPFA SEI message exists in the bitstream only if: the currently encoded / decoded layer video sequence (CLVS) includes a Neural Network Post-Processing Filter Feature (NNPFC) SEI message in a previous PU preceding the current PU in decoding order, the NNPFC identifier (nnpfc_id) of which is equal to a first nnpfa_id; or the current PU includes an NNPFC SEI message; and performing a conversion between visual media data and the bitstream based on the NNPFC SEI message.

[0196] 2. A method for processing media data, comprising: determining a Neural Network Post-Processing Filter Activation (NNPFA) identifier (nnpfa_id) in an NNPFA Supplemental Enhancement Information (SEI) message and a Neural Network Post-Processing Filter Feature (NNPFC) identifier (nnpfc_id) in an NNPFC SEI message, wherein when the NNPFA SEI message and the NNPFC SEI message are located in the same picture unit (PU) and nnpfa_id equals nnpfc_id, the NNPFC SEI message should be located before the NNPFA SEI message in the decoding order; and performing a conversion between visual media data and a bitstream based on the NNPFA SEI message and the NNPFC SEI message.

[0197] 3. An apparatus for processing video data, comprising: a processor; and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to perform the method as described in any one of solutions 1 to 2.

[0198] 4. A non-transitory computer-readable medium comprising a computer program product for use by a video codec apparatus, the computer program product comprising computer-executable instructions stored on the non-transitory computer-readable medium, causing the video codec apparatus to perform the method as described in any one of solutions 1 to 2 when executed by a processor.

[0199] 5. A non-transitory computer-readable recording medium storing a bitstream of video generated by a method performed by a video processing apparatus, wherein the method includes: determining a value of a second NNPFA identifier (nnpfa_id) in a Neural Network Post-Processing Filter Activation (NNPFA) Supplemental Enhancement Information (SEI) message associated with a current picture unit (PU), wherein the NNPFA SEI message exists in the bitstream only if:

[0200] The currently encoded / decoded layer video sequence (CLVS) includes Neural Network Post-Processing Filter Feature (NNPFC) SEI messages from the previous PU preceding the current PU in the decoding order, with its NNPFC identifier (nnpfc_id) equal to the first nnpfa_id; or the current PU includes NNPFC SEI messages; and a bitstream is generated based on the determination.

[0201] 6. A method for storing a bitstream of video, comprising: determining a value of a second NNPFA identifier (nnpfa_id) in a Neural Network Post-Processing Filter Activation (NNPFA) Supplemental Enhancement Information (SEI) message associated with a current Picture Unit (PU), wherein the NNPFA SEI message exists in the bitstream only if: the currently encoded / decoded Layer Video Sequence (CLVS) includes a Neural Network Post-Processing Filter Feature (NNPFC) SEI message in a previous PU preceding the current PU in decoding order, the NNPFC identifier (nnpfc_id) of which is equal to a first nnpfa_id; or the current PU includes an NNPFC SEI message; and generating a bitstream based on the determination; and storing the bitstream in a non-transitory computer-readable recording medium.

[0202] 7. A method, apparatus or system described in this patent document.
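The ordering constraint of solution 2 above can be sketched as a check on the SEI messages of a single PU. This is a minimal illustration: the representation of a PU as a list of `(kind, identifier)` tuples in decoding order and the function name are assumptions made for the example.

```python
def nnpfc_precedes_nnpfa(pu_messages):
    """Check the ordering rule within one PU.

    `pu_messages` is a list of (kind, identifier) tuples in decoding order,
    where kind is "NNPFC" or "NNPFA". If the PU holds both an NNPFC and an
    NNPFA SEI message with equal identifiers, the NNPFC must come first.
    """
    ids_with_nnpfc = {ident for kind, ident in pu_messages if kind == "NNPFC"}
    seen_nnpfc = set()
    for kind, ident in pu_messages:
        if kind == "NNPFC":
            seen_nnpfc.add(ident)
        elif kind == "NNPFA" and ident in ids_with_nnpfc and ident not in seen_nnpfc:
            return False
    return True


in_order = [("NNPFC", 1), ("NNPFA", 1)]
out_of_order = [("NNPFA", 1), ("NNPFC", 1)]
```

An NNPFA SEI message whose identifier has no NNPFC SEI message in the same PU is not constrained by this check, since its filter description would have to come from an earlier PU.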

[0203] The following solutions illustrate further examples of the techniques discussed in this document.

[0204] 1. A method for processing media data, comprising: performing a conversion between visual media data and a bitstream based on a rule; wherein the rule specifies that an NNPFA Supplemental Enhancement Information (SEI) message having a first specific value of a Neural Network Post-Processing Filter Activation (NNPFA) identifier exists only in the current picture unit (PU) when one or both of the following conditions are met: a) the currently encoded / decoded layer video sequence (CLVS) includes a Neural Network Post-Processing Filter Feature (NNPFC) SEI message from a previous PU preceding the current PU in decoding order, the NNPFC identifier (nnpfc_id) being equal to the first specific value of the NNPFA identifier; or b) the current PU includes an NNPFC SEI message whose nnpfc_id is equal to the first specific value of the NNPFA identifier.

[0205] 2. The method as described in Solution 1, wherein the rule further specifies that when the PU includes both an NNPFC SEI message with a second specific value of nnpfc_id and an NNPFA SEI message with an NNPFA identifier equal to the second specific value of nnpfc_id, the NNPFC SEI message should precede the NNPFA SEI message in the decoding order.

[0206] 3. The method of any one of solutions 1 to 2, wherein the NNPFA SEI message specifies a neural network post-processing filter (NNPF) for post-processing filtering of the current image.

[0207] 4. The method as described in any one of solutions 1 to 3, wherein the NNPFA SEI message is only maintained for the current image.

[0208] 5. The method as described in any one of solutions 1 to 4, wherein multiple NNPFA SEI messages exist for the same image.

[0209] 6. The method as described in any one of solutions 1 to 5, wherein multiple NNPFA SEI messages exist for the same image when multiple post-processing filters are used for different purposes.

[0210] 7. The method of any one of solutions 1 to 6, wherein multiple NNPFA SEI messages exist for the same image when multiple post-processing filters are used to filter different color components.

[0211] 8. The method of any one of solutions 1 to 7, wherein the NNPFC SEI message specifies the neural network used as a post-processing filter.

[0212] 9. The method of any one of solutions 1 to 8, wherein the conversion includes encoding visual media data into a bitstream.

[0213] 10. The method of any one of solutions 1 to 8, wherein the conversion includes decoding visual media data from a bitstream.

[0214] 11. A non-transitory computer-readable recording medium storing a bitstream of video generated by a method performed by a video processing apparatus, wherein the method includes: performing a conversion between visual media data and the bitstream based on a rule; wherein the rule specifies that an NNPFA Supplemental Enhancement Information (SEI) message having a first specific value of a Neural Network Post-Processing Filter Activation (NNPFA) identifier exists only in the current picture unit (PU) when one or both of the following conditions are met: a) the currently encoded / decoded layer video sequence (CLVS) includes a Neural Network Post-Processing Filter Feature (NNPFC) SEI message from a previous PU preceding the current PU in decoding order, the NNPFC identifier (nnpfc_id) being equal to the first specific value of the NNPFA identifier; or b) the current PU includes an NNPFC SEI message whose nnpfc_id is equal to the first specific value of the NNPFA identifier.

[0215] 12. The non-transitory computer-readable recording medium as described in Solution 11, wherein the rule further specifies that when the PU includes both an NNPFC SEI message having a second specific value of nnpfc_id and an NNPFA SEI message having an NNPFA identifier equal to the second specific value of nnpfc_id, the NNPFC SEI message shall precede the NNPFA SEI message in the decoding order.

[0216] 13. A non-transitory computer-readable recording medium as described in any one of solutions 11 to 12, wherein the NNPFA SEI message specifies a neural network post-processing filter (NNPF) for post-processing filtering of the current image.

[0217] 14. A non-transitory computer-readable recording medium as described in any one of solutions 11 to 13, wherein the NNPFA SEI message is retained only for the current image.

[0218] 15. A non-transitory computer-readable recording medium as described in any one of solutions 11 to 14, wherein multiple NNPFA SEI messages exist for the same picture.

[0219] 16. A non-transitory computer-readable recording medium as described in any one of solutions 11 to 15, wherein multiple NNPFA SEI messages exist for the same picture when multiple post-processing filters are used for different purposes.

[0220] 17. A non-transitory computer-readable recording medium as described in any one of solutions 11 to 16, wherein multiple NNPFA SEI messages exist for the same image when multiple post-processing filters are used to filter different color components.

[0221] 18. A non-transitory computer-readable recording medium as described in any one of solutions 11 to 17, wherein the NNPFC SEI message specifies a neural network used as a post-processing filter.

[0222] 19. A method for storing a bitstream of video, comprising: determining a current value of an NNPFA identifier in a Neural Network Post-Processing Filter Activation (NNPFA) Supplemental Enhancement Information (SEI) message in a current picture unit (PU), wherein the NNPFA SEI message exists in the current PU only if: the currently encoded / decoded layer video sequence (CLVS) includes a Neural Network Post-Processing Filter Feature (NNPFC) SEI message of a previous PU preceding the current PU in decoding order, the NNPFC identifier (nnpfc_id) of which is equal to the previous NNPFA identifier; or the current PU includes an NNPFC SEI message in which nnpfc_id is equal to the current value of the NNPFA identifier; generating a bitstream based on the determination; and storing the bitstream in a non-transitory computer-readable recording medium.

[0223] 20. The method as described in Solution 19, wherein when the PU includes both an NNPFC SEI message with a specific value of nnpfc_id and an NNPFA SEI message with an NNPFA identifier equal to the specific value of nnpfc_id, the NNPFC SEI message should precede the NNPFA SEI message in the decoding order.

[0224] 21. The method of any one of solutions 19 to 20, wherein the NNPFA SEI message specifies a neural network post-processing filter (NNPF) for post-processing filtering of the current image.

[0225] 22. The method of any one of solutions 19 to 21, wherein the NNPFA SEI message is maintained only for the current image.

[0226] 23. An apparatus for processing video data, comprising: a processor; and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to perform the method as described in any one of solutions 1 to 10.

[0227] 24. A non-transitory computer-readable medium comprising a computer program product for use by a video codec apparatus, the computer program product comprising computer-executable instructions stored on the non-transitory computer-readable medium, causing the video codec apparatus to perform the method as described in any one of solutions 1 to 10 when executed by a processor.
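Solutions 3 to 7 above describe an NNPFA SEI message that applies only to the current image and allow several such messages for one image. The following minimal sketch models that behavior; the dictionary of known filters, the filter descriptions, and the function name are hypothetical and serve only to illustrate the idea.

```python
def filters_for_picture(pu_nnpfa_ids, known_filters):
    """Return the filters activated for one image, in message order.

    `known_filters` maps an nnpfc_id to a filter description collected from
    NNPFC SEI messages. Nothing carries over to the next image: each image is
    filtered only by the NNPFA identifiers present in its own PU.
    """
    return [known_filters[i] for i in pu_nnpfa_ids if i in known_filters]


# Hypothetical filters for different purposes and color components.
known_filters = {1: "luma quality enhancement", 2: "chroma upsampling"}

picture_0 = filters_for_picture([1, 2], known_filters)  # two NNPFA messages
picture_1 = filters_for_picture([], known_filters)      # no NNPFA message
```

In this model the second image is left unfiltered because its PU carries no NNPFA SEI message, even though both filters remain known from earlier NNPFC SEI messages.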

[0228] In the solutions described in this document, the encoder conforms to the format rules by generating a coded representation according to those rules. The decoder uses the format rules to parse the syntax elements in the coded representation, knowing from the format rules whether those syntax elements are present or absent, and generates the decoded video.

[0229] In this document, the term "video processing" can refer to video encoding, video decoding, video compression, or video decompression. For example, a video compression algorithm may be applied during the conversion from a pixel representation of a video to a corresponding bitstream representation (or vice versa). For example, the bitstream representation of the current video block may correspond to bits that are co-located or spread across different positions within the bitstream, as defined by the syntax. For example, a macroblock may be encoded in terms of transformed and coded error residual values, and also using bits in headers and other fields in the bitstream. Furthermore, during the conversion, the decoder may parse the bitstream with the knowledge that certain fields may be present or absent, based on the determinations described in the solutions above. Similarly, the encoder may determine whether certain syntax fields are to be included and generate the coded representation accordingly by including or excluding those syntax fields.

[0230] The disclosed and other solutions, examples, embodiments, modules, and functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, that is, one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, a data processing apparatus. The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing apparatus" encompasses all apparatus, devices, and machines for processing data, including, for example, a programmable processor, a computer, or multiple processors or computers. In addition to hardware, the apparatus may also include code that creates an execution environment for the computer program in question, such as code constituting processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, such as a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to a suitable receiver device.

[0231] Computer programs (also referred to as programs, software, software applications, scripts, or code) can be written in any form of programming language, including compiled or interpreted languages, and can be deployed in any form, including as standalone programs or as modules, components, subroutines, or any other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored as a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), a single file dedicated to the program in question, or multiple coordinated files (e.g., a file storing portions of one or more modules, subroutines, or code). Computer programs can be deployed to execute on one or more computers located at a single site or distributed across multiple sites and interconnected by a communications network.

[0232] The processes and logic flows described in this document can be executed by one or more programmable processors that execute one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special-purpose logic circuitry, e.g., a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).

[0233] Processors suitable for executing computer programs include, for example, general-purpose and special-purpose microprocessors, and any one or more processors of any type of digital computer. Typically, a processor receives instructions and data from read-only memory or random access memory, or both. The basic elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Typically, a computer will also include one or more mass storage devices (e.g., magnetic disks, magneto-optical disks, or optical disks) for storing data, or be operatively coupled to receive data from or transfer data to such devices, or both. However, a computer need not have such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, such as semiconductor memory devices, e.g., erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and compact disc read-only memory (CD-ROM) and digital versatile disc read-only memory (DVD-ROM). The processor and the memory may be supplemented by, or incorporated in, special-purpose logic circuitry.

[0234] Although this patent document includes numerous details, these details should not be construed as limiting any subject matter or potentially claimed content, but rather as descriptions of features that may be specific to particular embodiments of a particular technology. Certain features described in this patent document within the context of individual embodiments may also be implemented in combination in a single embodiment. Conversely, various features described within the context of a single embodiment may also be implemented individually or in any suitable sub-combination in multiple embodiments. Furthermore, although features may be described above as functioning in certain combinations and even initially claimed in this way, in some cases, one or more features from a claimed combination may be removed from that combination, and the claimed combination may involve sub-combinations or variations thereof.

[0235] Similarly, although operations are shown in a specific order in the figures, this should not be construed as requiring such operations to be performed in the shown specific order or sequential order, or performing all shown operations to achieve the desired result. Furthermore, the separation of the various system components described in this patent document should not be construed as requiring such separation in all embodiments.

[0236] Only a few implementations and examples are described, and other implementations, enhancements and modifications can be made based on what is described and shown in this patent document.

[0237] When no intermediate component exists other than a line, trace, or other medium between the first and second components, the first component is directly coupled to the second component. When an intermediate component other than a line, trace, or other medium exists between the first and second components, the first component is indirectly coupled to the second component. The term "coupled" and its variations include direct coupling and indirect coupling. Unless otherwise stated, the use of the term "about" means including a range of ±10% of the subsequent value.

[0238] While several embodiments are provided in this disclosure, it should be understood that the disclosed systems and methods may be embodied in many other specific forms without departing from the spirit or scope of this disclosure. These examples are intended to be illustrative rather than restrictive and are not intended to be limited to the details given herein. For example, various elements or components may be combined or integrated into another system, or certain features may be omitted or not implemented.

[0239] Furthermore, without departing from the scope of this disclosure, the technologies, systems, subsystems, and methods described and illustrated as discrete or separate in the various embodiments may be combined or integrated with other systems, modules, technologies, or methods. Other items shown or discussed as coupled may be directly connected or indirectly coupled or communicated through some interface, device, or intermediate component, whether electrical, mechanical, or otherwise. Other examples of changes, substitutions, and modifications will be apparent to those skilled in the art, and such changes, substitutions, and modifications may be made without departing from the spirit and scope of this disclosure.

Claims

1. A method for processing visual media data, comprising: performing a conversion between visual media data and a bitstream of the visual media data based on a rule; wherein the rule specifies that a neural network post-processing filter activation (NNPFA) supplemental enhancement information (SEI) message having a first specific value of an NNPFA identifier is not present in a current picture unit (PU) unless one or both of the following conditions are met: a) the currently coded layer video sequence (CLVS) includes, in a PU preceding the current PU in decoding order, a Neural Network Post-Processing Filter Feature (NNPFC) SEI message having an NNPFC identifier equal to the first specific value of the NNPFA identifier; or b) the current PU includes an NNPFC SEI message having an NNPFC identifier equal to the first specific value of the NNPFA identifier.

2. The method as described in claim 1, wherein the rule further specifies that when a PU includes both an NNPFC SEI message having a second specific value of an NNPFC identifier and an NNPFA SEI message having an NNPFA identifier equal to the second specific value of the NNPFC identifier, the NNPFC SEI message shall precede the NNPFA SEI message in decoding order.

3. The method as described in claim 1, wherein, The NNPFA SEI message activates or deactivates the target neural network post-processing filter NNPF for possible use in post-processing filtering of a set of images.

4. The method of claim 1, wherein, The rule further stipulates that the NNPFA SEI message is only retained for the current image.

5. The method of claim 1, wherein, The rule further stipulates that multiple NNPFA SEI messages are allowed to exist for the same image.

6. The method of claim 1, wherein, The rule further stipulates that when multiple neural network post-processing filters are used for different purposes, multiple NNPFA SEI messages are allowed to exist for the same image.

7. The method of claim 1, wherein, The rule further stipulates that when multiple neural network post-processing filters are used to filter different color components, multiple NNPFA SEI messages are allowed to exist for the same image.

8. The method of claim 1, wherein, The NNPFC SEI message specifies the neural network that can be used as a post-processing filter.

9. The method of claim 1, wherein, The conversion includes encoding the visual media data into the bitstream.

10. The method of claim 1, wherein, The conversion includes decoding the visual media data from the bitstream.

11. The method of claim 1, wherein, The rule further specifies that the use of a prescribed neural network post-processing filter for a particular image is indicated by an NNPFA SEI message.

12. A non-transitory computer-readable recording medium having stored thereon instructions and a bitstream of visual media data, the instructions, when executed by a processor, causing the processor to perform a method, wherein the method includes: generating the bitstream of the visual media data based on a rule; wherein the rule specifies that a neural network post-processing filter activation (NNPFA) supplemental enhancement information (SEI) message having a first specific value of an NNPFA identifier is not present in a current picture unit (PU) unless one or both of the following conditions are met: a) the currently coded layer video sequence (CLVS) includes, in a PU preceding the current PU in decoding order, a Neural Network Post-Processing Filter Feature (NNPFC) SEI message having an NNPFC identifier equal to the first specific value of the NNPFA identifier; or b) the current PU includes an NNPFC SEI message having an NNPFC identifier equal to the first specific value of the NNPFA identifier.

13. The non-transitory computer-readable recording medium of claim 12, wherein the rule further specifies that when a PU includes both an NNPFC SEI message having a second specific value of an NNPFC identifier and an NNPFA SEI message having an NNPFA identifier equal to the second specific value of the NNPFC identifier, the NNPFC SEI message shall precede the NNPFA SEI message in decoding order.

14. The non-transitory computer-readable recording medium of claim 12, wherein, The NNPFA SEI message can be used to activate or deactivate the target neural network post-processing filter NNPF for post-processing filtering a set of images.

15. The non-transitory computer-readable recording medium of claim 12, wherein, The rule further stipulates that the NNPFA SEI message is only retained for the current image.

16. The non-transitory computer-readable recording medium of claim 12, wherein, The rule further stipulates that multiple NNPFA SEI messages are allowed to exist for the same image.

17. The non-transitory computer-readable recording medium of claim 12, wherein, The rule further stipulates that when multiple neural network post-processing filters are used for different purposes, multiple NNPFA SEI messages are allowed to exist for the same image.

18. The non-transitory computer-readable recording medium of claim 12, wherein, The rule further stipulates that when multiple neural network post-processing filters are used to filter different color components, multiple NNPFA SEI messages are allowed to exist for the same image.

19. The non-transitory computer-readable recording medium of claim 12, wherein, The NNPFC SEI message specifies the neural network that can be used as a post-processing filter.

20. A method for storing a bitstream of visual media data, comprising: generating the bitstream of the visual media data based on a rule; wherein the rule specifies that a neural network post-processing filter activation (NNPFA) supplemental enhancement information (SEI) message having a first specific value of an NNPFA identifier is not present in a current picture unit (PU) unless one or both of the following conditions are met: a) the currently coded layer video sequence (CLVS) includes, in a PU preceding the current PU in decoding order, a Neural Network Post-Processing Filter Feature (NNPFC) SEI message having an NNPFC identifier equal to the first specific value of the NNPFA identifier; or b) the current PU includes an NNPFC SEI message having an NNPFC identifier equal to the first specific value of the NNPFA identifier; and storing the bitstream in a non-transitory computer-readable recording medium.

21. The method of claim 20, wherein, when a PU includes both an NNPFC SEI message having a second specific value of an NNPFC identifier and an NNPFA SEI message having an NNPFA identifier equal to the second specific value, the NNPFC SEI message shall precede the NNPFA SEI message in decoding order.
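By way of illustration only, the ordering rule of claim 21 can be sketched as a check on a single PU. The model and the names `SeiMessage`, `kind`, `ident`, and `nnpfc_precedes_nnpfa` are hypothetical. The sketch further assumes that, if a PU carries several NNPFC messages with the same identifier, the rule is satisfied when the first of them precedes the NNPFA message; the claims do not address that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SeiMessage:
    kind: str   # hypothetical label: "NNPFC" or "NNPFA"
    ident: int  # NNPFC identifier, or NNPFA identifier, depending on kind


def nnpfc_precedes_nnpfa(pu: List[SeiMessage]) -> bool:
    """Check the ordering rule within one PU (messages in decoding order)."""
    for pos, msg in enumerate(pu):
        if msg.kind != "NNPFA":
            continue
        # Positions of NNPFC messages in this PU with the same identifier.
        matching = [
            i for i, other in enumerate(pu)
            if other.kind == "NNPFC" and other.ident == msg.ident
        ]
        # The rule applies only when the PU carries both message types
        # with the same identifier value.
        if matching and min(matching) > pos:
            return False
    return True
```

A PU carrying only an NNPFA message, with the matching NNPFC message sent in an earlier PU, passes this check trivially, because the ordering rule applies only when both messages share a PU.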

22. The method of claim 20, wherein the NNPFA SEI message activates or deactivates a target neural network post-processing filter (NNPF) for possible use in post-processing filtering of a set of pictures.

23. The method of claim 20, wherein the rule further specifies that the NNPFA SEI message persists only for the current picture.

24. An apparatus for processing visual media data, comprising: a processor; and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to: perform a conversion between visual media data and a bitstream of the visual media data based on a rule, wherein the rule specifies that a neural network post-filter activation (NNPFA) supplemental enhancement information (SEI) message having a first specific value of an NNPFA identifier is not present in a current picture unit (PU) unless one or both of the following conditions are met: a) the current coded layer video sequence (CLVS) includes, in a PU preceding the current PU in decoding order, a neural network post-filter characteristics (NNPFC) SEI message having an NNPFC identifier equal to the first specific value of the NNPFA identifier; or b) the current PU includes an NNPFC SEI message having an NNPFC identifier equal to the first specific value of the NNPFA identifier.

25. The apparatus of claim 24, wherein the rule further specifies that, when a PU includes both an NNPFC SEI message having a second specific value of an NNPFC identifier and an NNPFA SEI message having an NNPFA identifier equal to the second specific value, the NNPFC SEI message shall precede the NNPFA SEI message in decoding order.

26. The apparatus of claim 24, wherein the rule further specifies that multiple NNPFA SEI messages are allowed to be present for the same picture.

27. The apparatus of claim 24, wherein the rule further specifies that multiple NNPFA SEI messages are allowed to be present for the same picture when multiple neural network post-processing filters are used for different purposes.

28. The apparatus of claim 24, wherein the rule further specifies that multiple NNPFA SEI messages are allowed to be present for the same picture when multiple neural network post-processing filters are used to filter different color components.

29. A non-transitory computer-readable storage medium having instructions stored thereon, the instructions causing a processor to: perform a conversion between visual media data and a bitstream of the visual media data based on a rule, wherein the rule specifies that a neural network post-filter activation (NNPFA) supplemental enhancement information (SEI) message having a first specific value of an NNPFA identifier is not present in a current picture unit (PU) unless one or both of the following conditions are met: a) the current coded layer video sequence (CLVS) includes, in a PU preceding the current PU in decoding order, a neural network post-filter characteristics (NNPFC) SEI message having an NNPFC identifier equal to the first specific value of the NNPFA identifier; or b) the current PU includes an NNPFC SEI message having an NNPFC identifier equal to the first specific value of the NNPFA identifier.

30. The non-transitory computer-readable storage medium of claim 29, wherein the rule further specifies that, when a PU includes both an NNPFC SEI message having a second specific value of an NNPFC identifier and an NNPFA SEI message having an NNPFA identifier equal to the second specific value, the NNPFC SEI message shall precede the NNPFA SEI message in decoding order.

31. The non-transitory computer-readable storage medium of claim 29, wherein the rule further specifies that multiple NNPFA SEI messages are allowed to be present for the same picture.