Image processing method, processing device and storage medium

Filtering image blocks based on multiple input branches together with a neural network and / or a lookup table reduces the complexity of the filtering process in high-efficiency video coding, which supports more efficient video encoding and decoding.

CN120455665B · Active · Publication Date: 2026-06-12 · SHENZHEN TRANSSION HLDG CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: SHENZHEN TRANSSION HLDG CO LTD
Filing Date: 2025-06-17
Publication Date: 2026-06-12

Abstract

The application provides an image processing method, a processing device and a storage medium. The image processing method comprises: performing filtering processing on at least one image block according to a plurality of input branches, a neural network and / or a lookup table. Through the technical scheme, the complexity of filtering processing can be reduced, and the efficiency of video encoding and / or decoding can be improved.

Description

Technical Field

[0001] This application relates to the field of image processing technology, specifically to an image processing method, processing device, and storage medium.

Background Technology

[0002] Existing high-efficiency video coding frameworks, such as Neural Network Based Video Coding (NNVC) and / or Enhanced Compression Model (ECM), propose a video frame coding technique to improve coding performance without significantly increasing computational complexity.

[0003] In conceiving and implementing this application, the inventors discovered at least the following problems: In the transformation stage of neural network loop filtering during the encoding and decoding process, the diversity of image block features leads to an increase in the number of convolution operations of the neural network, resulting in increased computational complexity, which in turn limits the efficiency of video encoding and / or decoding.

[0004] The preceding description is intended to provide general background information and does not necessarily constitute prior art.

Summary of the Invention

[0005] To address the aforementioned technical problems, this application provides an image processing method, processing device, and storage medium, aiming to solve the technical problem of how to reduce the complexity of filtering processing, thereby supporting the improvement of video encoding and / or decoding efficiency.

[0006] This application provides an image processing method, applicable to a processing device, comprising the following steps:

[0007] S1, filtering at least one image block based on multiple input branches, a neural network, and / or a lookup table.

[0008] Optionally, the features of the image block include at least one of the following: the reconstructed transform features of the reconstructed block of at least one image block, the predicted transform features of the predicted block of at least one image block, and the derived transform features of the derived block of at least one image block.

[0009] Optionally, step S1 includes the following steps:

[0010] S11, performing channel splicing based on multiple input branches and the features of at least one image block to determine or obtain input features;

[0011] S12, filtering at least one input feature based on a neural network and / or a lookup table.
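
As a rough illustration of step S11, channel splicing can be read as concatenation along the channel axis of feature maps that share the same spatial size. The sketch below is not taken from the application: the data layout (a feature map as a list of 2D channels), the helper names `concat_channels` and `constant_map`, and the channel counts are all assumptions chosen for the example.

```python
# Hypothetical sketch of step S11: channel splicing (concatenation) for one input branch.
# Assumed layout: a feature map is a list of channels, each channel a 2D list [H][W].

def concat_channels(*feature_maps):
    """Concatenate feature maps along the channel axis after checking H x W."""
    if not feature_maps:
        raise ValueError("at least one feature map is required")
    height = len(feature_maps[0][0])
    width = len(feature_maps[0][0][0])
    out = []
    for fmap in feature_maps:
        for channel in fmap:
            if len(channel) != height or any(len(row) != width for row in channel):
                raise ValueError("all channels must share the same spatial size")
            out.append(channel)
    return out


def constant_map(channels, height, width, value):
    """Build a toy feature map filled with one value (illustration only)."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]


# Toy luminance features: 2 reconstructed, 2 predicted, 1 derived channel (assumed counts).
rec_y = constant_map(2, 4, 4, 1.0)
pred_y = constant_map(2, 4, 4, 2.0)
der_y = constant_map(1, 4, 4, 3.0)

# Input feature for a luminance branch: 2 + 2 + 1 = 5 channels.
luma_input = concat_channels(rec_y, pred_y, der_y)
print(len(luma_input))
```

The resulting `luma_input` would then be handed to the filter of step S12, whose internals are left open here.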

[0012] Optionally, multiple input branches include at least one of the following:

[0013] The first input branch is used to process the luminance component;

[0014] The second input branch is used to process the chrominance components;

[0015] The third input branch is used for mixing the luminance and chrominance components.

[0016] Optionally, the first input branch includes at least one of the following: a first sub-input branch for processing the reconstructed transform features and derived transform features on the luminance component; a second sub-input branch for processing the predicted transform features and derived transform features on the luminance component; a third sub-input branch for processing the reconstructed transform features on the luminance component; a fourth sub-input branch for processing the predicted transform features on the luminance component; and a fifth sub-input branch for processing the derived transform features on the luminance component.

[0017] Optionally, the second input branch includes at least one of the following: a sixth sub-input branch for processing the reconstructed transform features and derived transform features on the chrominance components; a seventh sub-input branch for processing the predicted transform features and derived transform features on the chrominance components; an eighth sub-input branch for processing the reconstructed transform features on the chrominance components; a ninth sub-input branch for processing the predicted transform features on the chrominance components; a tenth sub-input branch for processing the derived transform features on the chrominance components; a fourth input branch for processing the U component; and a fifth input branch for processing the V component.

[0018] Optionally, performing channel splicing based on multiple input branches and the features of at least one image block includes at least one of the following:

[0019] Based on the first input branch, performing channel splicing on the predicted transform features of at least one luminance component, the reconstructed transform features of at least one luminance component, and the derived transform features of at least one luminance component;

[0020] Based on the second input branch, performing channel splicing on the predicted transform features of at least one chrominance component, the reconstructed transform features of at least one chrominance component, and the derived transform features of at least one luminance component;

[0021] Based on the third input branch, performing channel splicing on at least one of the following: the predicted transform features of at least one luminance component and / or chrominance component, the derived transform features of at least one luminance component and / or chrominance component, and the reconstructed transform features of at least one luminance component and / or chrominance component;

[0022] Based on the first sub-input branch, performing channel splicing on the reconstructed transform features of at least one luminance component and the derived transform features of at least one luminance component;

[0023] Based on the second sub-input branch, performing channel splicing on the predicted transform features of at least one luminance component and the derived transform features of at least one luminance component;

[0024] Based on the third sub-input branch, performing channel splicing on the reconstructed transform features of at least one luminance component;

[0025] Based on the fourth sub-input branch, performing channel splicing on the predicted transform features of at least one luminance component;

[0026] Based on the fifth sub-input branch, performing channel splicing on the derived transform features of at least one luminance component;

[0027] Based on the sixth sub-input branch, performing channel splicing on the reconstructed transform features of at least one chrominance component and the derived transform features of at least one chrominance component;

[0028] Based on the seventh sub-input branch, performing channel splicing on the predicted transform features of at least one chrominance component and the derived transform features of at least one chrominance component;

[0029] Based on the eighth sub-input branch, performing channel splicing on the reconstructed transform features of at least one chrominance component;

[0030] Based on the ninth sub-input branch, performing channel splicing on the predicted transform features of at least one chrominance component;

[0031] Based on the tenth sub-input branch, performing channel splicing on the derived transform features of at least one chrominance component;

[0032] Based on the fourth input branch, performing channel splicing on the predicted transform features of at least one U component, the reconstructed transform features of at least one U component, and the derived transform features of at least one U component;

[0033] Based on the fifth input branch, performing channel splicing on the predicted transform features of at least one V component, the reconstructed transform features of at least one V component, and the derived transform features of at least one V component.
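
One way to picture the branch list above is as a table that records, for each branch or sub-branch, which component it handles and which feature kinds it splices. The sketch below covers only a few of the listed branches; the dictionary `BRANCHES`, the function `build_branch_input`, the string channel labels, and the splicing order are assumptions made for illustration, not details given in the application.

```python
# Hypothetical branch table: component and feature kinds spliced by each branch.
# Only a subset of the listed branches is shown; the order of kinds is an assumption.
BRANCHES = {
    "first_input_branch": ("Y", ("pred", "rec", "der")),
    "first_sub_input_branch": ("Y", ("rec", "der")),
    "second_sub_input_branch": ("Y", ("pred", "der")),
    "third_sub_input_branch": ("Y", ("rec",)),
    "fourth_input_branch": ("U", ("pred", "rec", "der")),
    "fifth_input_branch": ("V", ("pred", "rec", "der")),
}


def build_branch_input(branch, features):
    """Collect the channels a branch splices, in the order given by BRANCHES."""
    component, kinds = BRANCHES[branch]
    channels = []
    for kind in kinds:
        channels.extend(features[(component, kind)])
    return channels


# Channels are represented by labels here; real channels would be 2D sample arrays.
features = {
    ("Y", "pred"): ["Y_pred_0"],
    ("Y", "rec"): ["Y_rec_0", "Y_rec_1"],
    ("Y", "der"): ["Y_der_0"],
    ("U", "pred"): ["U_pred_0"],
    ("U", "rec"): ["U_rec_0"],
    ("U", "der"): ["U_der_0"],
}

print(build_branch_input("first_sub_input_branch", features))
print(build_branch_input("fourth_input_branch", features))
```

Keeping the selection in a table makes it easy to see that sub-input branches differ from their parent branch only in which feature kinds they include.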

[0034] Optionally, the neural network includes a grouped convolutional module for dividing at least one input feature into at least one group for convolutional processing.

[0035] Optionally, at least one group includes at least one of the following:

[0036] At least one group for processing the features, among the at least one input feature, that correspond to the first input branch;

[0037] At least one group for processing the features, among the at least one input feature, that correspond to the second input branch;

[0038] At least one group for processing the features, among the at least one input feature, that correspond to the third input branch;

[0039] At least one group for processing the features, among the at least one input feature, that correspond to the first sub-input branch;

[0040] At least one group for processing the features, among the at least one input feature, that correspond to the second sub-input branch;

[0041] At least one group for processing the features, among the at least one input feature, that correspond to the third sub-input branch;

[0042] At least one group for processing the features, among the at least one input feature, that correspond to the fourth sub-input branch;

[0043] At least one group for processing the features, among the at least one input feature, that correspond to the fifth sub-input branch;

[0044] At least one group for processing the features, among the at least one input feature, that correspond to the sixth sub-input branch;

[0045] At least one group for processing the features, among the at least one input feature, that correspond to the seventh sub-input branch;

[0046] At least one group for processing the features, among the at least one input feature, that correspond to the eighth sub-input branch;

[0047] At least one group for processing the features, among the at least one input feature, that correspond to the ninth sub-input branch;

[0048] At least one group for processing the features, among the at least one input feature, that correspond to the tenth sub-input branch;

[0049] At least one group for processing the features, among the at least one input feature, that correspond to the fourth input branch;

[0050] At least one group for processing the features, among the at least one input feature, that correspond to the fifth input branch;

[0051] At least one group composed of at least one channel from the plurality of channels of each input branch.
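
The complexity benefit of grouping can be made concrete with a small calculation. For a standard convolution with C_in input channels, C_out output channels, and a k x k kernel, the weight count (ignoring biases) is C_in * C_out * k * k; splitting the channels evenly into G independent groups divides that count by G. The sketch below assumes even splits and uses a 1 x 1 (pointwise) convolution at a single pixel to show the independence of groups; the function names and the numbers are illustrative assumptions rather than values from the application.

```python
def conv_weight_count(c_in, c_out, kernel, groups=1):
    """Weights of a (grouped) convolution, biases ignored; assumes even channel splits."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by the number of groups")
    return (c_in // groups) * (c_out // groups) * kernel * kernel * groups


def grouped_pointwise_conv(pixel, weights):
    """Apply a grouped 1x1 convolution at one pixel.

    pixel:   list of C_in channel values at that pixel
    weights: one matrix per group, shaped [out_per_group][in_per_group]
    Each group reads only its own slice of the input channels.
    """
    groups = len(weights)
    if len(pixel) % groups:
        raise ValueError("input channels must be divisible by the number of groups")
    in_per_group = len(pixel) // groups
    out = []
    for g, matrix in enumerate(weights):
        chunk = pixel[g * in_per_group:(g + 1) * in_per_group]
        for row in matrix:
            out.append(sum(w * x for w, x in zip(row, chunk)))
    return out


print(conv_weight_count(64, 64, 3))            # one dense 3x3 convolution
print(conv_weight_count(64, 64, 3, groups=4))  # four independent groups

# Two groups: group 0 sees channels [1, 2], group 1 sees channels [3, 4].
print(grouped_pointwise_conv([1, 2, 3, 4], [[[1, 0], [0, 1]], [[1, 1]]]))
```

In this toy setting, assigning the channels of each input branch to its own group would mean that, for example, luminance and chrominance features are convolved by separate, smaller kernels.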

[0052] Optionally, the input branch is at least one of the multiple input branches corresponding to at least one input feature.

[0053] Optionally, the image processing method further includes at least one of the following:

[0054] The grouped convolutional module includes at least one convolutional module corresponding to a group, and the convolutional modules corresponding to at least one group are independent of each other;

[0055] At least two groups contain the same number of input channels;

[0056] At least two groups have the same convolutional layer parameters;

[0057] The grouped convolutional module includes, in sequence, a precursor layer, a first channel shuffling layer, and a subsequent convolutional module for each group;

[0058] The input to the first channel shuffling layer is the output of the precursor layers of at least two groups, and the output of the first channel shuffling layer is the input to the subsequent convolutional modules of at least two groups;

[0059] The precursor layer includes at least one convolutional module for convolutional processing of at least one group of features;

[0060] The first channel shuffling layer is used to perform cross-group permutation of the output feature map of the precursor layer in the channel dimension according to the first channel rearrangement rule;

[0061] The grouped convolutional module includes a second channel shuffling layer and a convolutional module for each group, and the input of the convolutional module for each group is the output of the second channel shuffling layer;

[0062] The second channel shuffling layer is used to perform cross-group permutation of the input feature map, which includes at least two groups of features, in the channel dimension according to the second channel rearrangement rule;

[0063] The grouped convolutional module sequentially includes a second channel shuffling layer, a precursor layer for each group, a first channel shuffling layer, and a subsequent convolutional module for each group.
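
The channel rearrangement rules are not spelled out in this part of the text. A common choice for cross-group permutation in grouped networks is an interleaving shuffle (view the channels as a groups x per-group grid, transpose, and flatten), and the sketch below assumes that rule purely for illustration. The function name `channel_shuffle` and the channel labels are hypothetical.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (assumed rearrangement rule).

    Input order:  all channels of group 0, then group 1, ...
    Output order: first channel of every group, then second channel of every group, ...
    """
    if len(channels) % groups:
        raise ValueError("channel count must be divisible by the number of groups")
    per_group = len(channels) // groups
    return [channels[g * per_group + i] for i in range(per_group) for g in range(groups)]


# Outputs of the precursor layers of two groups, "a" and "b", three channels each.
precursor_out = ["a0", "a1", "a2", "b0", "b1", "b2"]
shuffled = channel_shuffle(precursor_out, groups=2)
print(shuffled)

# Inputs of the two subsequent convolutional modules after the shuffle.
per_group = len(shuffled) // 2
print(shuffled[:per_group], shuffled[per_group:])
```

After the shuffle, each subsequent group receives channels that originated in both precursor groups, which is how information can cross group boundaries even though the convolutions themselves stay independent.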

[0064] Optionally, step S12 includes at least one of the following:

[0065] Determining or obtaining at least one first intermediate value based on a neural network and at least one input feature, and filtering at least one image block based on the at least one first intermediate value and at least one lookup table;

[0066] Determining or obtaining at least one second intermediate value based on at least one input feature and at least one lookup table, and filtering at least one image block based on a neural network and the at least one second intermediate value;

[0067] Determining or generating at least one index based on at least one input feature, and filtering at least one image block based on the at least one index and at least one lookup table;

[0068] Determining or obtaining at least one third intermediate value based on a neural network and at least one input feature, determining or obtaining at least one fourth intermediate value based on the at least one third intermediate value and at least one lookup table, and filtering at least one image block based on a neural network and the at least one fourth intermediate value;

[0069] Determining or obtaining at least one fifth intermediate value based on at least one input feature and at least one lookup table, determining or obtaining at least one sixth intermediate value based on the at least one fifth intermediate value and a neural network, and filtering at least one image block based on the at least one sixth intermediate value and at least one lookup table.
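
To make the index-plus-lookup-table variant more tangible, the following minimal sketch derives an index from quantized input samples and reads a correction offset from a table, replacing arithmetic with a memory lookup. Everything specific here is an assumption for illustration: the use of a (reconstructed, predicted) sample pair as the input feature, the quantization step and level count, the table contents, and the function names `quantize` and `lut_filter`.

```python
def quantize(value, step, levels):
    """Map a sample value to a bucket index in [0, levels - 1]."""
    return max(0, min(levels - 1, int(value // step)))


def lut_filter(rec_block, pred_block, lut, step=64, levels=4):
    """Filter a block by adding a table-driven offset to each reconstructed sample.

    The index combines the quantized reconstructed and predicted samples
    (an assumed indexing scheme); lut must hold levels * levels offsets.
    """
    if len(lut) != levels * levels:
        raise ValueError("lookup table size must equal levels * levels")
    out = []
    for rec_row, pred_row in zip(rec_block, pred_block):
        out_row = []
        for rec, pred in zip(rec_row, pred_row):
            index = quantize(rec, step, levels) * levels + quantize(pred, step, levels)
            out_row.append(rec + lut[index])
        out.append(out_row)
    return out


# Toy table: all offsets zero except the bucket for rec in [64, 128), pred in [128, 192).
lut = [0] * 16
lut[1 * 4 + 2] = -2

rec = [[100, 10]]
pred = [[130, 10]]
print(lut_filter(rec, pred, lut))
```

In the hybrid variants listed above, the value used to form the index (or the value read from the table) would instead be an intermediate value produced by, or passed on to, the neural network.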

[0070] Optionally, the derived block is determined or obtained by at least one of the following:

[0071] A cropping result obtained by cropping the reconstructed block and / or predicted block of at least one image block;

[0072] A filling result obtained by filling the reconstructed block and / or predicted block of at least one image block;

[0073] An update result obtained by performing pixel updates on the reconstructed block and / or predicted block of at least one image block;

[0074] A translation result obtained by performing pixel translation on the reconstructed block and / or predicted block of at least one image block;

[0075] An editing result obtained by performing pixel-by-pixel editing on the reconstructed block and / or predicted block of at least one image block.
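
Three of the operations listed above (cropping, filling, and translation) can be sketched on a block stored as a 2D list of samples. The text does not fix the parameters of these operations, so the constant fill value, the symmetric border, the horizontal-only shift, and the function names below are assumptions for illustration.

```python
def crop(block, top, left, height, width):
    """Return the height x width sub-block whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in block[top:top + height]]


def pad(block, size, value=0):
    """Surround the block with a border of `size` samples set to `value`."""
    width = len(block[0]) + 2 * size
    border_row = [value] * width
    body = [[value] * size + list(row) + [value] * size for row in block]
    return ([list(border_row) for _ in range(size)]
            + body
            + [list(border_row) for _ in range(size)])


def translate(block, dx, fill=0):
    """Shift every row right by dx samples (left if dx < 0), filling vacated positions."""
    width = len(block[0])
    out = []
    for row in block:
        if dx >= 0:
            shifted = [fill] * min(dx, width) + list(row[:max(width - dx, 0)])
        else:
            shifted = list(row[-dx:]) + [fill] * min(-dx, width)
        out.append(shifted)
    return out


rec_block = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]

print(crop(rec_block, 1, 1, 2, 2))
print(pad([[1]], 1))
print(translate(rec_block, 1))
```

Each result has the same form as the source block, so a derived block produced this way could be transformed and spliced alongside the reconstructed and predicted features.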

[0076] This application also provides a processing apparatus, including:

[0077] A processing module, used to filter at least one image block based on multiple input branches, a neural network, and / or a lookup table.

[0078] This application also provides a processing device, including: a memory and a processor, wherein the memory stores an image processing program, and when the image processing program is executed by the processor, it implements the steps of any of the image processing methods described above.

[0079] This application also provides a storage medium storing a computer program that, when executed by a processor, implements the steps of any of the image processing methods described above.

[0080] As described above, the image processing method of this application can be applied to a processing device, including: performing filtering processing on at least one image block based on multiple input branches, a neural network, and / or a lookup table. Through the technical solution of this application, when performing filtering processing on at least one image block using a neural network and / or a lookup table, multiple input branches are comprehensively considered, which can reduce the complexity of the filtering process and thus support improved efficiency in video encoding and / or decoding.

Attached Figure Description

[0081] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application. To more clearly illustrate the technical solutions of the embodiments of this application, the drawings used in the description of the embodiments will be briefly introduced below. Obviously, those skilled in the art can obtain other drawings based on these drawings without any creative effort.

[0082] Figure 1 is a schematic diagram of the hardware structure of a mobile terminal for implementing the various embodiments of this application;

[0083] Figure 2 is an architecture diagram of a communication network system provided in an embodiment of this application;

[0084] Figure 3 is a schematic diagram of the hardware structure of a controller 140 provided in this application;

[0085] Figure 4 is a schematic diagram of the hardware structure of a network node 150 provided in this application;

[0086] Figure 5 is a schematic diagram of the DCT transformation provided in this application;

[0087] Figure 6 is a flowchart of the image processing method according to the first embodiment;

[0088] Figure 7 is a schematic diagram of the encoding and decoding process in the image processing method provided in this application;

[0089] Figure 8 is a flowchart of the image processing method according to the second embodiment;

[0090] Figure 9 is a schematic diagram of the architecture for processing based on a neural network and a first input branch provided in this application;

[0091] Figure 10 is a schematic diagram of the processing module of the processing device.

[0092] The realization of the objectives, functional features, and advantages of this application will be further explained in conjunction with the embodiments and with reference to the accompanying drawings. The accompanying drawings have illustrated specific embodiments of this application, which will be described in more detail below. These drawings and textual descriptions are not intended to limit the scope of the concept in any way, but rather to illustrate the concepts of this application to those skilled in the art through reference to specific embodiments.

Detailed Implementation

[0093] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.

[0094] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element. Furthermore, components, features, and elements with the same names in different embodiments of this application may have the same meaning or different meanings, the specific meaning of which must be determined by its interpretation in that specific embodiment or further in conjunction with the context of that specific embodiment.

[0095] It should be understood that although the terms first, second, third, etc., may be used herein to describe various information, such information should not be limited to these terms. These terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of this document, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word “if” as used herein may be interpreted as “when…” or “in response to determination”. Furthermore, as used herein, the singular forms “a,” “an,” and “the” are intended to also include the plural forms unless the context indicates otherwise. It should be further understood that the terms “comprising,” “including,” indicate the presence of the stated feature, step, operation, element, component, item, kind, and / or group, but do not exclude the presence, occurrence, or addition of one or more other features, steps, operations, elements, components, items, kinds, and / or groups. The terms “or,” “and / or,” “including at least one of the following,” etc., as used in this application may be interpreted as inclusive, or mean any one or any combination thereof. For example, "including at least one of the following: A, B, C" means "any one of the following: A; B; C; A and B; A and C; B and C; A and B and C." Similarly, "A, B, or C" or "A, B, and / or C" means "any one of the following: A; B; C; A and B; A and C; B and C; A and B and C." Exceptions to this definition only occur when the combination of elements, functions, steps, or operations is inherently mutually exclusive in some way.

[0096] It should be understood that although the steps in the flowcharts of this application's embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times, and their execution order is not necessarily sequential, but can be performed alternately or in turn with other steps or at least a portion of the sub-steps or stages of other steps.

[0097] Depending on the context, the words “if” or “suppose” as used here can be interpreted as “when” or “in response to determination” or “in response to detection.” Similarly, depending on the context, the phrases “if determination” or “if detection (of the stated condition or event)” can be interpreted as “when determination” or “in response to determination” or “when detection (of the stated condition or event)” or “in response to detection (of the stated condition or event).”

[0098] It should be noted that step designations such as S10 and S20 are used in this document for the purpose of more clearly and concisely describing the corresponding content, and do not constitute a substantial limitation on the order. In specific implementation, those skilled in the art may execute S20 first and then S10, etc., but these should all be within the protection scope of this application.

[0099] It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to limit this application.

[0100] In the following description, suffixes such as "module," "part," or "unit" are used to denote elements solely for ease of description and have no specific meaning in themselves. Therefore, "module," "part," and "unit" may be used interchangeably.

[0101] The processing device in this application can be a smart terminal or a server, and the smart terminal can be implemented in various forms. For example, the smart terminal described in this application can include smart terminals such as mobile phones, tablets, laptops, handheld computers, personal digital assistants (PDAs), portable media players (PMPs), navigation devices, wearable devices, smart bracelets, pedometers, etc., as well as fixed terminals such as digital TVs and desktop computers.

[0102] The following description will use a mobile terminal as an example. Those skilled in the art will understand that, apart from elements specifically designed for mobile purposes, the construction according to the embodiments of this application can also be applied to fixed-type terminals.

[0103] Please refer to Figure 1, which is a schematic diagram of the hardware structure of a mobile terminal implementing various embodiments of this application. The mobile terminal 100 may include: an RF (Radio Frequency) unit 101, a WiFi module 102, an audio output unit 103, an A / V (Audio / Video) input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, and a power supply 111, etc. Those skilled in the art will understand that the mobile terminal structure shown in Figure 1 does not constitute a limitation on the mobile terminal. The mobile terminal may include more or fewer components than shown, or combine certain components, or have different component arrangements.

[0104] The components of the mobile terminal are described in detail below with reference to Figure 1:

[0105] The radio frequency unit 101 can be used for receiving and transmitting signals during information transmission or calls. Specifically, it receives downlink information from the base station and passes it to the processor 110 for processing; additionally, it transmits uplink data to the base station. Typically, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier, and a duplexer. Furthermore, the radio frequency unit 101 can also communicate wirelessly with networks and other devices. The aforementioned wireless communications may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division Duplexing-Long Term Evolution), TDD-LTE (Time Division Duplexing-Long Term Evolution), 5G, and 6G.

[0106] WiFi is a short-range wireless transmission technology. Through the WiFi module 102, the mobile terminal can help users send and receive emails, browse web pages, and access streaming media, providing users with wireless broadband internet access. Although Figure 1 shows the WiFi module 102, it is understood that it is not an essential component of the mobile terminal and can be omitted as needed without changing the nature of the invention.

[0107] The audio output unit 103 can convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into audio signals and output them as sound when the mobile terminal 100 is in call signal receiving mode, call mode, recording mode, voice recognition mode, broadcast receiving mode, etc. Furthermore, the audio output unit 103 can also provide audio output related to specific functions performed by the mobile terminal 100 (e.g., call signal receiving sound, message receiving sound, etc.). The audio output unit 103 may include a speaker, a buzzer, etc.

[0108] The A / V input unit 104 is used to receive audio or video signals. The A / V input unit 104 may include a graphics processing unit (GPU) 1041 and a microphone 1042. The GPU 1041 processes image data of still images or videos acquired by an image capture device (such as a camera) in video capture mode or image capture mode. The processed image frames can be displayed on the display unit 106. The image frames processed by the GPU 1041 can be stored in the memory 109 (or other storage medium) or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 can receive sound (audio data) in operating modes such as telephone call mode, recording mode, and voice recognition mode, and can process such sound into audio data. The processed audio (voice) data can be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 101 in telephone call mode. The microphone 1042 can implement various types of noise cancellation (or suppression) algorithms to eliminate (or suppress) noise or interference generated during the reception and transmission of audio signals.

[0109] The mobile terminal 100 also includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Optionally, the light sensor includes an ambient light sensor and a proximity sensor. Optionally, the ambient light sensor can adjust the brightness of the display panel 1061 according to the ambient light level, and the proximity sensor can turn off the display panel 1061 and / or backlight when the mobile terminal 100 is moved to the ear. As a type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in various directions (generally three axes), and can detect the magnitude and direction of gravity when stationary. It can be used for applications that recognize the phone's posture (such as landscape / portrait switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer, tapping), etc. Other sensors that may be configured in the phone, such as fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, will not be described in detail here.

[0110] The display unit 106 is used to display information input by the user or information provided to the user. The display unit 106 may include a display panel 1061, which may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.

[0111] User input unit 107 can be used to receive input numerical or character information, and generate key signal inputs related to user settings and function control of the mobile terminal. Optionally, user input unit 107 may include touch panel 1071 and other input devices 1072. Touch panel 1071, also known as a touch screen, can collect touch operations performed by the user on or near it (such as operations performed by the user using a finger, stylus, or any suitable object or accessory on or near touch panel 1071), and drive corresponding connection devices according to a pre-set program. Touch panel 1071 may include a touch detection device and a touch controller. Optionally, the touch detection device detects the user's touch position and the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, sends it to processor 110, and can receive and execute commands sent by processor 110. In addition, touch panel 1071 can be implemented using various types such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 1071, the user input unit 107 may also include other input devices 1072. Optionally, other input devices 1072 may include, but are not limited to, one or more of the following: physical keyboard, function keys (such as volume control buttons, power buttons, etc.), trackball, mouse, joystick, etc., without being specifically limited here.

[0112] Optionally, the touch panel 1071 may cover the display panel 1061. When the touch panel 1071 detects a touch operation on or near it, it transmits the information to the processor 110 to determine the type of touch event. Subsequently, the processor 110 provides corresponding visual output on the display panel 1061 based on the type of touch event. Although in Figure 1 the touch panel 1071 and the display panel 1061 are shown as two independent components that realize the input and output functions of the mobile terminal, in some embodiments the touch panel 1071 and the display panel 1061 can be integrated to realize the input and output functions of the mobile terminal. The specific implementation is not limited here.

[0113] Interface unit 108 serves as an interface through which at least one external device can connect to mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device with an identification module, an audio input / output (I / O) port, a video I / O port, a headphone port, and so on. Interface unit 108 may be used to receive input (e.g., data, power, etc.) from the external device and transmit the received input to one or more elements within mobile terminal 100, or it may be used to transmit data between mobile terminal 100 and the external device.

[0114] The memory 109 can be used to store software programs and various data. The memory 109 may primarily include a program storage area and a data storage area. Optionally, the program storage area may store the operating system, applications required for at least one function (such as sound playback, image playback, etc.), etc.; the data storage area may store data created based on the use of the mobile phone (such as audio data, phonebook, etc.). Furthermore, the memory 109 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device.

[0115] The processor 110 is the control center of the mobile terminal. It connects various parts of the mobile terminal via various interfaces and lines. By running or executing software programs and / or modules stored in the memory 109, and by calling data stored in the memory 109, it performs various functions and processes data of the mobile terminal, thereby providing overall monitoring of the mobile terminal. The processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor and a modem processor. Optionally, the application processor mainly handles the operating system, user interface, and applications, while the modem processor mainly handles wireless communication. It is understood that the modem processor may not be integrated into the processor 110.

[0116] The mobile terminal 100 may also include a power supply 111 (such as a battery) that supplies power to various components. Preferably, the power supply 111 can be logically connected to the processor 110 through a power management system, thereby enabling functions such as charging, discharging, and power consumption management through the power management system.

[0117] Although not shown in Figure 1, the mobile terminal 100 may also include a Bluetooth module, etc., which will not be described in detail here.

[0118] To facilitate understanding of the embodiments of this application, the communication network system on which the mobile terminal of this application is based is described below.

[0119] Referring to Figure 2, Figure 2 is an architecture diagram of a communication network system provided by this application. The communication network system is an LTE system of universal mobile telecommunications technology. The LTE system includes a UE (User Equipment) 201, an E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) 202, an EPC (Evolved Packet Core) 203, and the operator's IP services 204, which are connected in sequence.

[0120] Optionally, UE201 can be the aforementioned terminal 100, which will not be described in detail here.

[0121] E-UTRAN202 includes eNodeB2021 and other eNodeB2022, etc. Optionally, eNodeB2021 can connect to other eNodeB2022 via backhaul (e.g., X2 interface), and eNodeB2021 connects to EPC203, providing access from UE201 to EPC203.

[0122] EPC203 may include MME (Mobility Management Entity) 2031, HSS (Home Subscriber Server) 2032, other MMEs 2033, SGW (Serving Gateway) 2034, PGW (Packet Data Network Gateway) 2035, and PCRF (Policy and Charging Rules Function) 2036, etc. Optionally, MME2031 is the control node that handles signaling between UE201 and EPC203, providing bearer and connection management. HSS2032 provides registers, such as the Home Location Register (not shown in the figure), and stores user-specific information such as service characteristics and data rates. All user data can be sent through SGW2034. PGW2035 can provide IP address allocation for UE201 and other functions. PCRF2036 is the policy and charging control decision point for service data flows and IP bearer resources. It selects and provides available policy and charging control decisions for the policy and charging enforcement function unit (not shown in the figure).

[0123] IP services 204 may include the Internet, intranet, IMS (IP Multimedia Subsystem), or other IP services.

[0124] Although the above description uses the LTE system as an example, those skilled in the art should know that this application is not only applicable to the LTE system, but also to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA, 5G and future new network systems (such as 6G), etc., without limitation.

[0125] Figure 3 is a schematic diagram of the hardware structure of a controller 140 provided in this application. The controller 140 includes a memory 1401 and a processor 1402. The memory 1401 is used to store program instructions, and the processor 1402 is used to call the program instructions in the memory 1401 to execute the steps performed by the controller in the first embodiment of the above method. The implementation principle and beneficial effects are similar, and will not be described again here.

[0126] Optionally, the controller further includes a communication interface 1403, which can be connected to the processor 1402 via a bus 1404. The processor 1402 can control the communication interface 1403 to implement the receiving and sending functions of the controller 140.

[0127] Figure 4 is a schematic diagram of the hardware structure of a network node 150 provided in this application. The network node 150 includes a memory 1501 and a processor 1502. The memory 1501 is used to store program instructions, and the processor 1502 is used to call the program instructions in the memory 1501 to execute the steps performed by the first node in the first embodiment of the above method. The implementation principle and beneficial effects are similar, and will not be described again here.

[0128] Optionally, the network node further includes a communication interface 1503, which can be connected to the processor 1502 via a bus 1504. The processor 1502 can control the communication interface 1503 to implement the receiving and sending functions of the network node 150.

[0129] The integrated modules described above, implemented as software functional modules, can be stored in a computer-readable storage medium. These software functional modules, stored in a storage medium, include several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute some steps of the methods of the various embodiments of this application.

[0130] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the flow or function according to the embodiments of this application is generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a storage medium or transmitted from one storage medium to another. For example, computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state disk, SSD), etc.

[0131] Based on the above-described mobile terminal hardware structure and communication network system, various embodiments of this application are proposed.

[0132] Optionally, a brief introduction will be given regarding the neural networks that may be involved in the embodiments of this application.

[0133] The NNLF (Neural Network based Loop Filter) in NNVC (Neural Network Based Video Coding) includes two structures: HOP (High Complexity Operation Point) and LOP (Low Complexity Operation Point). For the LOP structure, the input to the loop filter can include reconstructed luma and chroma samples (Rec), predicted luma and chroma samples (Pred), boundary strength information (BS) for luma and chroma, the base quantization parameter (QPbase), the slice quantization parameter (QPslice), and block prediction information (IPB).

[0134] The luma samples of the Rec and Pred inputs need to undergo transformation processing. Specifically, for each input luma block of size W (width) × H (height), a 2×2 DCT-II transformation is applied to each 2×2 sub-block, and the result is rearranged into a tensor of size (W / 2) × (H / 2) × 4, where 4 represents the four frequency channels. Subsequently, the transformed luma samples are concatenated with the chroma U and V channels to form a feature tensor of size (W / 2) × (H / 2) × 6.
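The per-sub-block transform described above can be sketched as follows. This is a minimal illustration, not the NNVC reference implementation: the orthonormal scaling (a factor of 1/2), the channel order [DC, horizontal, vertical, diagonal], and the function name `dct2x2_blocks` are assumptions made for this sketch. The output is laid out as rows × columns × 4, i.e. (H / 2) × (W / 2) × 4.

```python
def dct2x2_blocks(luma):
    """Apply an orthonormal 2x2 DCT-II to each non-overlapping 2x2 sub-block.

    luma: list of H rows, each a list of W samples (H and W must be even).
    Returns a nested list of shape (H/2) x (W/2) x 4. The channel order
    [DC, horizontal, vertical, diagonal] is an assumption of this sketch.
    """
    h, w = len(luma), len(luma[0])
    if h % 2 or w % 2:
        raise ValueError("block height and width must be even")
    out = []
    for y in range(0, h, 2):
        row = []
        for x in range(0, w, 2):
            a, b = luma[y][x], luma[y][x + 1]          # top row of the sub-block
            c, d = luma[y + 1][x], luma[y + 1][x + 1]  # bottom row of the sub-block
            row.append([
                (a + b + c + d) / 2,  # DC term
                (a - b + c - d) / 2,  # horizontal frequency
                (a + b - c - d) / 2,  # vertical frequency
                (a - b - c + d) / 2,  # diagonal frequency
            ])
        out.append(row)
    return out
```

Note that each output position depends only on the four samples of its own sub-block; no sample from a neighboring sub-block contributes to it.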

[0135] Optionally, when constructing the reconstructed-sample input, the block can be extended by eight neighboring samples in each direction to create a 144×144 image patch. This 144×144×1 input luma patch is then transformed and rearranged into a 72×72×4 tensor. For the chroma reconstruction samples, the values within each 2×2 sub-block are constant because of a preceding 2x upsampling, so no transformation is required; they can be directly downsampled by a factor of 2 to obtain a 72×72×2 tensor. To maintain compatibility with the current training process, the 2x upsampling operation for chroma is retained, because the chroma data was already upsampled and stored along with the luma data in the training data. In a more efficient implementation, both the upsampling and downsampling operations can be omitted without affecting the final result.
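The claim that the chroma upsampling and downsampling can both be omitted can be checked with a small sketch. It assumes nearest-neighbor upsampling (each chroma sample repeated into a constant 2×2 sub-block), which is consistent with the text but not stated explicitly. The 128-sample core size used in the usage note is inferred from 144 − 2 × 8 and is likewise an assumption, as are the function names.

```python
def upsample2x(plane):
    """Nearest-neighbor 2x upsampling: each sample becomes a constant 2x2 sub-block."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]  # repeat each sample horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def downsample2x(plane):
    """Keep the top-left sample of every 2x2 sub-block."""
    return [row[::2] for row in plane[::2]]


def padded_size(core, border=8):
    """Side length of a patch after extending `border` samples in each direction."""
    return core + 2 * border
```

Under these assumptions, `downsample2x(upsample2x(plane))` returns the original plane, so skipping both steps yields the same tensor. Similarly, `padded_size(128)` gives 144, and halving it gives the 72-sample side of the transformed tensors.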

[0136] For QPbase (base quantization parameter) and QPslice (slice quantization parameter), since they remain constant throughout the entire image patch, no transformation is required; they are directly arranged into a tensor of (W / 2)×(H / 2)×1. For IPB (block prediction information), since it only contains luma information, it is transformed and rearranged into a tensor of (W / 2)×(H / 2)×4.

[0137] The transformed input data is processed through convolutional layers with kernel sizes of 3×3 or 1×1, and then concatenated to achieve feature fusion and transformation. In the fusion and transformation module, separable convolutions of 1×3 and 3×1 and a downsampling operation with a factor of 2 are used. The network then splits into two branches, one for processing luminance information and the other for processing chrominance information.

[0138] It can be seen that the NNLF based on the 2x2 DCT transform processes only independent, non-overlapping sub-blocks during image feature extraction, so it completely ignores the spatial information of the pixel regions that straddle the boundaries of adjacent DCT (Discrete Cosine Transform) blocks. In other words, it divides the image block into non-overlapping 2x2 sub-blocks and performs the DCT transformation independently on each sub-block, failing to consider the cross-block correlation of pixels at the boundaries of adjacent sub-blocks (such as edge continuity, texture consistency, etc.). The information it leaves out can be referred to as overlapping region information or sub-block boundary information.

[0139] As shown in Figure 5, the DCT transform only considers the information within the solid-line boxes in the image patch, while ignoring the information within the dashed-line boxes. This lack of information makes it impossible for the filter to effectively capture the transition features between image patches, resulting in the loss of key structural details during feature extraction. This, in turn, affects the ability of subsequent filtering operations to smooth the boundaries of image patches and the overall reconstruction quality.

[0140] Optionally, since the traditional NNLF uses a 2x2 DCT transform to extract image features from image blocks, its performance is limited because it neglects the overlapping region information of the image blocks. Therefore, in this embodiment, at least one derived block of an image block can be obtained by cropping and padding the image block. This allows the derived block to incorporate the spatial correlation of pixels at the boundaries of adjacent DCT blocks. The derived block can contain the overlapping region information or sub-block boundary information of the image block, thereby improving filtering performance while maintaining low complexity. And / or, because the Y and UV channels are processed independently in the original NNLF, adding an input branch would increase network complexity; to address this defect, this embodiment also sets corresponding optimized channel processing logic, such as a network architecture that fuses, in parallel, the Y component with the new Y component and the UV component with the new UV component, to effectively reduce computational redundancy and achieve a balance between model complexity and performance.

[0141] Optionally, in the embodiments of this application, a cross-block modeling and multi-directional extraction mechanism for overlapping pixel features can be implemented.

[0142] This mechanism can overcome the limitation that the 2x2 DCT does not process sub-block boundary information, and / or capture the overlapping region information of adjacent block boundaries in the horizontal, vertical, and mixed horizontal-vertical directions through operations such as cropping and padding (e.g., RecOverlap (overlap region information of reconstructed blocks) and PredOverlap (overlap region information of predicted blocks)). This enables the filter to perceive the spatial correlation of cross-block pixels (e.g., edge continuity, texture transition features), avoiding the incomplete feature extraction caused by block fragmentation. Furthermore, by designing a multi-directional overlapping information extraction process and using lightweight operations such as zero padding and mirror-symmetric padding, it can expand the perception range of the DCT transform with almost no increase in parameters, thereby improving the spatial context integrity of the features.

[0143] Optionally, in the embodiments of this application, a low-complexity multi-component fusion architecture based on grouped convolution can be constructed.

[0144] Optionally, to avoid the increased computational load caused by adding overlapping region information or sub-block boundary information, in this embodiment, the Y luma components (original Y + overlapping Y) and the UV chroma components (original UV + overlapping UV) can each be integrated into a single branch, avoiding the redundant computation of traditional independent channel processing. In the feature fusion stage, a grouped convolution with a grouping coefficient of 2 is introduced, dividing the input channels into two groups for parallel processing. While maintaining the interaction of multi-component features, this reduces the computational load by approximately 50%, achieving a balance between complexity and performance optimization.
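The roughly 50% figure follows from how a grouped convolution connects channels: with a grouping coefficient of 2, each output channel reads only half of the input channels. A minimal sketch of the multiply-accumulate (MAC) count is given below. The function name `conv_macs` and the channel counts in the usage note are hypothetical and are not taken from the network described here; bias terms are ignored.

```python
def conv_macs(c_in, c_out, kernel_h, kernel_w, out_h, out_w, groups=1):
    """Count multiply-accumulate operations of a 2D convolution (bias ignored).

    With `groups` > 1, each output channel is connected to only
    c_in / groups input channels.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    per_output_sample = (c_in // groups) * kernel_h * kernel_w
    return per_output_sample * c_out * out_h * out_w
```

For example, with hypothetical sizes of 16 input channels, 16 output channels, a 3×3 kernel, and a 72×72 output, `conv_macs(16, 16, 3, 3, 72, 72)` is exactly twice `conv_macs(16, 16, 3, 3, 72, 72, groups=2)`. The weight count shrinks by the same factor.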

[0145] Optionally, in the embodiments of this application, a multi-branch collaborative channel feature joint optimization mechanism can be implemented.

[0146] In the Y branch, the DCT features of the original block and the overlapping block are concatenated, and then feature fusion is achieved through grouped convolution to enhance the edge structure information dominated by brightness. In the UV branch, a similar process is used to preserve the color transition details of chroma, while cross-branch parameter sharing further reduces the overall complexity. This approach overcomes the limitation of independent processing of the Y and UV channels in traditional LOP models by using component concatenation and grouped convolution to achieve synergistic optimization of brightness and chroma features in overlapping regions, thereby improving the overall consistency of image reconstruction.

[0147] First Embodiment

[0148] Referring to Figure 6, Figure 6 is a flowchart illustrating the image processing method according to the first embodiment. The image processing method of this embodiment of the application can be applied to a processing device and includes step S10:

[0149] S10, filtering is performed on at least one image patch based on multiple input branches, a neural network, and / or a lookup table.

[0150] In this embodiment, the processing device can be a smart terminal, such as a mobile phone or computer, or a server, such as a local server or a cloud server. This embodiment and this application primarily use a smart terminal as an example for illustration.

[0151] Optionally, the technical solution of this embodiment can be applied to fields such as image encoding and decoding, video encoding and decoding, hardware video encoding and decoding, dedicated circuit video encoding and decoding, and real-time video encoding and decoding.

[0152] Optionally, for ease of understanding, a brief introduction to the encoding and decoding process is provided. As shown in Figure 7, the system includes modules such as general coding control, transform and quantization, intra-frame estimation, intra-frame prediction, motion compensation, motion estimation, inverse quantization and inverse transform, filter control analysis, deblocking filtering and SAO filtering (i.e., loop filtering), entropy coding, and a decoding frame buffer. Optionally, the motion compensation module can perform intra-frame / inter-frame selection to determine the specific compensation. Optionally, during entropy coding, the coding bit rate is obtained based on the general control data determined by the general coding control module, the quantized transform coefficients determined by the transform and quantization module, the intra-frame prediction data and filter control determined by the filter control analysis, and the motion data determined by the decoding frame buffer.

[0153] Optionally, the decoded video signal can be output through a decoding frame buffer.

[0154] Optionally, the loop filtering may include two branches: a deblocking filter and an LC-NNLF. The results of the branches are then fused and processed by SAO (Sample Adaptive Offset) and ALF (Adaptive Loop Filter).

[0155] Optionally, the multiple input branches can be different input branches used to input into the neural network and / or lookup table. Multiple input branches can be constructed according to the luminance and chrominance components, and the luminance and chrominance information corresponding to at least one image patch can be input into the neural network and / or lookup table for filtering according to their respective input branches. Alternatively, other rules can be followed, such as comparing the reference pixel value of at least one image patch with a pixel threshold; reference pixel values greater than the pixel threshold can be input into one input branch of the neural network and / or lookup table, while reference pixel values less than or equal to the pixel threshold can be input into another input branch of the neural network and / or lookup table for filtering, etc.
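The threshold rule for assigning inputs to branches can be sketched as below. The text does not define how the reference pixel value of a patch is computed, so the default used here (the mean of all samples) is purely an assumption, as are the names `route_by_threshold`, `branch_high`, and `branch_low`.

```python
def route_by_threshold(blocks, threshold, reference=None):
    """Split blocks into two input branches by comparing a reference value to a threshold.

    `reference` maps a block (list of rows) to its reference pixel value. The
    default, the mean of all samples, is an assumption made for this sketch.
    """
    if reference is None:
        def reference(block):
            samples = [v for row in block for v in row]
            return sum(samples) / len(samples)
    branch_high, branch_low = [], []
    for block in blocks:
        if reference(block) > threshold:
            branch_high.append(block)  # strictly greater than the threshold
        else:
            branch_low.append(block)   # less than or equal to the threshold
    return branch_high, branch_low
```

A caller can pass any other `reference` function, for example one that returns the top-left sample, without changing the routing logic.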

[0156] Optionally, the neural network can be a fully connected layer-based neural network; a convolutional layer-based neural network; a Transformer-based neural network; or a hybrid neural network combining convolutional, fully connected, and Transformer layers, etc. This embodiment uses a convolutional layer-based neural network as an example.

[0157] Optionally, the lookup table can be a filter lookup table, which may include filter modes and / or filter pixel values, etc.

[0158] Optionally, the input parameters in each input branch within the multi-input branch can be determined based on at least one image patch, or based on a reconstructed block and / or a predicted block of at least one image patch, or based on a derived block of at least one reconstructed block and / or a predicted block.

[0159] Optionally, a derived block can be determined or obtained based on the cropping result of cropping the predicted block and / or reconstructed block of at least one image block.

[0160] Optionally, the derived block can be determined or obtained based on the filling result of filling the predicted block and / or reconstructed block of at least one image block.

[0161] Optionally, a derived block can be determined or obtained based on the results of pixel editing, and / or pixel translation, and / or pixel updating of a predicted block and / or reconstructed block of at least one image block.

[0162] Optionally, when filtering at least one image block, the derived blocks of the at least one image block can be considered in combination, so that the spatial correlation information of the overlapping pixel regions at the boundaries of adjacent DCT blocks is taken into account, through the derived blocks, when filtering the at least one image block.

[0163] Optionally, a transformation process can be performed on the derived blocks of at least one image patch, such as performing DCT operations on the derived blocks to extract derived block information. This derived block information may include overlap region information or sub-block boundary information, such as RecOverlap (reconstructed block overlap region information), PredOverlap (predicted block overlap region information), reconstructed block sub-block boundary information, and predicted block sub-block boundary information. The derived block transformation features are determined based on the derived block information, for example, by directly using the pixel values and / or pixel positions of the derived blocks as the derived block transformation features.

[0164] Optionally, at least one reconstructed block and / or prediction block can be transformed to obtain corresponding reconstructed transformation features and prediction transformation features.

[0165] Optionally, features in each input branch of the multi-input branch can be determined based on at least one of the reconstructed transformation features, predicted transformation features, and derived block transformation features, and used as input parameters. Then, filtering is performed based on the neural network and / or lookup table to obtain the filtered target image block.

[0166] Optionally, the processing device can be a decoding end, whereby at least one image patch can be filtered based on multiple input branches, neural networks, and / or lookup tables.

[0167] Optionally, the processing device may be an encoding end, whereby at least one image patch may be filtered based on multiple input branches, a neural network, and / or a lookup table.

[0168] In the embodiments of this application, at least one image block is filtered based on multiple input branches, neural networks and / or lookup tables. This ensures that when filtering at least one image block using neural networks and / or lookup tables, multiple input branches are taken into account, which can reduce the complexity of the filtering process and thus support improved efficiency of video encoding and / or decoding.

[0169] Second Embodiment

[0170] Based on the first embodiment, a second embodiment is proposed.

[0171] In this embodiment, the image patch features include at least one of the following: the reconstruction transform features of the reconstructed block of at least one image patch, the prediction transform features of the predicted block of at least one image patch, and the derivative transform features of the derived block of at least one image patch.

[0172] Optionally, at least one reconstructed block may be transformed using at least one of the following: Discrete Cosine Transform (DCT), Discrete Sine Transform (DST), Karhunen–Loève Transform (KLT), Wavelet Transform, and Hadamard Transform.

[0173] Optionally, pixel features can be extracted from at least one reconstructed block after transformation to obtain reconstructed transformation features, such as pixel positions and pixel values of the reconstructed block.

[0174] Optionally, at least one prediction block can be transformed.

[0175] Optionally, pixel features can be extracted from at least one prediction block after transformation to obtain prediction transformation features, such as the pixel position and pixel value of the prediction block.

[0176] Optionally, the derived blocks of at least one image block may include at least one of: a derived block of at least one reconstructed block, and a derived block of at least one predicted block.

[0177] Optionally, at least one derived block can be transformed, and pixel features can be extracted from the transformed at least one derived block to obtain derived transformation features, such as the pixel position and pixel value of the derived block.

[0178] Optionally, the transformation methods for at least two of the reconstructed block, the predicted block, and the derived block can be the same or different, and there is no restriction on this.

[0179] In this embodiment, by determining that at least one of the reconstructed transformation features, predicted transformation features, and derived transformation features is an image patch feature, it is easier to perform channel stitching based on the image patch features combined with multiple input branches, and then perform filtering processing by combining neural networks and / or lookup tables.

[0180] Optionally, the channels in this embodiment can be used for lookup tables.

[0181] Optionally, a channel is a component of the feature map of an image patch along the depth dimension, and describes the feature representation in one specific dimension. Each channel represents a certain feature (such as texture, edge, derived distribution, etc.) extracted from the image patch (such as a prediction patch or a reconstruction patch). For example, one channel may be used to detect horizontal edges, and another channel may be used to detect vertical edges, etc.

[0182] Optionally, the image patch features of at least two channels can be concatenated to obtain the first multi-channel feature. Then, based on the pre-set correspondence between the channel features and the index, the index corresponding to the first multi-channel feature (such as a one-dimensional index, a two-dimensional index, or a three-dimensional index) can be determined or obtained, and it can be input into a lookup table for searching to determine or obtain the target image patch after filtering.
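One way to picture the concatenate, index, and look up sequence is the sketch below. The text only says that a pre-set correspondence between channel features and indices exists; the uniform quantization used here to build the index, the integer-valued features, and the names `feature_index`, `lut_lookup`, and `lut_filter` are all assumptions for illustration, not the correspondence actually used.

```python
def feature_index(features, step, levels):
    """Map concatenated integer channel features to a multi-dimensional index.

    One index axis per channel. Uniform quantization with `step`, clamped to
    [0, levels - 1], is an assumed stand-in for the pre-set correspondence.
    """
    return tuple(min(max(f // step, 0), levels - 1) for f in features)


def lut_lookup(table, index):
    """Walk a nested-list lookup table along each index dimension."""
    entry = table
    for i in index:
        entry = entry[i]
    return entry


def lut_filter(channels, table, step, levels):
    """Filter position by position: concatenate channels, index the table, read the output."""
    h, w = len(channels[0]), len(channels[0][0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            stacked = [ch[y][x] for ch in channels]  # channel concatenation at (y, x)
            row.append(lut_lookup(table, feature_index(stacked, step, levels)))
        out.append(row)
    return out
```

With two channels the index is two-dimensional and `table` is a list of lists; with three channels it is three-dimensional, and so on. The table size grows as `levels` raised to the number of channels, which is why the quantization granularity matters in practice.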

[0183] Optionally, the derived block is determined or obtained through at least one of the following steps a1-a5:

[0184] Step a1: Cropping result of cropping the reconstructed block and / or predicted block of at least one image patch;

[0185] Optionally, for a derived block of at least one reconstructed block, the boundary region of at least one reconstructed block (such as at least one column on the leftmost side of the reconstructed block, at least one column on the rightmost side of the reconstructed block, at least one row on the top side of the reconstructed block, at least one row on the bottom side of the reconstructed block, etc.) can be cropped to obtain the cropping result.

[0186] Optionally, the cropping result may include at least one cropped reconstructed block.

[0187] Optionally, a derived block of at least one reconstructed block can be determined or obtained based on the at least one cropped reconstructed block, such as by zero-padding the at least one cropped reconstructed block.

[0188] Optionally, the size parameters of at least one reconstructed block's derived block are matched with the size parameters of at least one reconstructed block. For example, the width of reconstructed block n is the same as the width of the derived block of reconstructed block n, and the height of reconstructed block n is the same as the height of the derived block of reconstructed block n.
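A minimal sketch of cropping followed by zero padding, with the derived block keeping the size of the reconstructed block, is given below. Cropping from the left and top and padding on the right and bottom is one possible arrangement chosen for this sketch; the function name `crop_and_zero_pad` and its parameters are hypothetical.

```python
def crop_and_zero_pad(block, left=1, top=0):
    """Drop `left` columns and `top` rows, then zero-pad right/bottom to keep the size.

    block: list of rows of equal length. The result has the same height and
    width as the input, with its content shifted toward the top-left corner.
    """
    height, width = len(block), len(block[0])
    if not (0 <= left <= width and 0 <= top <= height):
        raise ValueError("crop amounts must lie within the block")
    cropped = [row[left:] for row in block[top:]]
    padded = [row + [0] * left for row in cropped]   # restore the original width
    padded += [[0] * width for _ in range(top)]      # restore the original height
    return padded
```

Shifting by one sample makes each 2×2 sub-block of the derived block straddle a sub-block boundary of the original block: `left=1` covers horizontal neighbors, `top=1` covers vertical neighbors, and both together cover the mixed case.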

[0189] Optionally, for a derived block of at least one prediction block:

[0190] The boundary region of at least one prediction block (such as at least one column on the leftmost side of the prediction block, at least one column on the rightmost side of the prediction block, at least one row on the top side of the prediction block, at least one row on the bottom side of the prediction block, etc.) can be cropped to obtain the cropping result.

[0191] Optionally, the cropping result may include at least one cropped prediction block.

[0192] Optionally, a derived block of at least one prediction block can be determined or obtained based on the at least one cropped prediction block, such as by performing mirror-symmetric padding on the at least one cropped prediction block.

[0193] Optionally, the size parameters of the derived blocks of at least one prediction block are matched with the size parameters of at least one prediction block. For example, the width of prediction block m is the same as the width of the derived blocks of prediction block m, and the height of prediction block m is the same as the height of the derived blocks of prediction block m.
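Mirror-symmetric padding after cropping can be sketched in the same spirit. This sketch reflects the samples nearest the right edge outward, with the edge sample itself repeated; whether the edge sample is repeated is not specified in the text and is an assumption here, as is the function name `crop_and_mirror_pad`.

```python
def crop_and_mirror_pad(block, left=1):
    """Drop `left` columns on the left, then refill on the right by mirroring the edge.

    The output keeps the input width. The edge sample is included in the
    reflection, which is an assumption of this sketch.
    """
    width = len(block[0])
    if not 0 <= left <= width // 2:
        raise ValueError("left must be between 0 and half the block width")
    out = []
    for row in block:
        kept = row[left:]
        mirrored = kept[::-1][:left]  # samples nearest the right edge, reflected outward
        out.append(kept + mirrored)
    return out
```

Compared with zero padding, the mirrored samples keep the padded region at a similar signal level to its neighbors, so the transform of the last sub-block is not dominated by an artificial step to zero.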

[0194] In this embodiment, by cropping the reconstructed block and / or predicted block of at least one image block, a derived block is determined or obtained, ensuring that the derived block is closely related to the reconstructed block and / or predicted block. This facilitates subsequent use of the derived block transformation features, combined with multiple input branches, neural networks, and / or lookup tables, for filtering, thus ensuring the effectiveness of the filtering process.

[0195] Step a2: Filling result of filling the reconstructed block and / or predicted block of at least one image patch;

[0196] Optionally, the reconstructed block and / or predicted block of at least one image patch may be cropped, and then the cropped reconstructed block and / or predicted block may be filled. The filling method may include at least one of zero filling and mirror symmetric filling.

[0197] For example, the leftmost and rightmost columns of at least one reconstructed block and / or prediction block can be cropped, and then two columns of pixels can be filled on the right side of the at least one reconstructed block and / or prediction block. Alternatively, the top and bottom rows of at least one reconstructed block and / or prediction block can be cropped, and then two rows of pixels can be filled at the bottom of the at least one reconstructed block and / or prediction block.

[0198] Optionally, pixel cleaning can be performed on the even-numbered columns of the boundaries (such as the leftmost and / or rightmost) of at least one reconstructed block and / or prediction block, and then pixel filling can be performed on the pixel-cleaned areas in the boundaries of at least one reconstructed block and / or prediction block. The filling method may include at least one of zero filling and mirror symmetric filling.

[0199] Optionally, pixel cleaning can be performed on the even-numbered rows of the boundaries (such as the topmost and / or bottommost) of at least one reconstructed block and / or prediction block, and then pixel filling can be performed on the regions in the boundaries of at least one reconstructed block and / or prediction block that have undergone pixel cleaning. The filling method may include at least one of zero filling and mirror symmetric filling.

[0200] Optionally, the filling result includes at least one reconstructed block that has been filled, and can be used as a derived block corresponding to the reconstructed block.

[0201] Optionally, the filling result includes at least one filled prediction block, which can be used as a derived block corresponding to the prediction block.

[0202] In this embodiment, a derived block is determined or obtained based on the filling result of filling the reconstructed block and / or predicted block of at least one image block. This ensures that the derived block is closely related to the reconstructed block and / or predicted block, which facilitates subsequent filtering by utilizing the derived block transformation features of the derived block, combined with multiple input branches, neural networks and / or lookup tables, thus ensuring the effectiveness of the filtering process.

[0203] Step a3: the update result of performing a pixel update on the reconstructed block and / or predicted block of at least one image block;

[0204] Optionally, pixel updates can be performed on even columns of the boundaries (such as the leftmost and / or rightmost) of at least one reconstructed block and / or predicted block, for example, updating all of them to zero pixels.

[0205] Optionally, pixel updates can be performed on even-numbered rows of the boundaries (such as the topmost and / or bottommost) of at least one reconstructed block and / or predicted block, for example, updating all rows to zero pixels.
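One possible reading of the column case can be sketched as below. The sketch assumes that "even columns of the boundaries" means an even number of columns at the left and / or right boundary; the function name `zero_boundary_columns` and its parameters are hypothetical.

```python
import numpy as np

def zero_boundary_columns(block, n_cols=2, sides=("left", "right")):
    """Update n_cols boundary columns to zero pixels (n_cols assumed even and positive)."""
    if n_cols <= 0 or n_cols % 2 != 0:
        raise ValueError("n_cols is assumed to be a positive even number")
    out = block.copy()                 # leave the original block untouched
    if "left" in sides:
        out[:, :n_cols] = 0            # leftmost boundary columns
    if "right" in sides:
        out[:, -n_cols:] = 0           # rightmost boundary columns
    return out

block = np.arange(36).reshape(6, 6)    # stand-in for a reconstructed or predicted block
updated = zero_boundary_columns(block)
print(updated[0])                      # [0 0 2 3 0 0]
```

The row case is the same operation applied along the first axis.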

[0206] Optionally, the update result includes at least one reconstructed block that has been pixel-updated, and can be used as a derived block corresponding to the reconstructed block.

[0207] Optionally, the update result includes at least one predicted block that has been pixel-updated, and can be used as the derived block corresponding to the predicted block.

[0208] In this embodiment, a derived block is determined or obtained based on the update result of pixel updates of at least one image block's reconstructed block and / or predicted block. This ensures that the derived block is closely related to the reconstructed block and / or predicted block, which facilitates subsequent use of the derived block's transformed features, combined with multiple input branches, neural networks, and / or lookup tables, for filtering processing, thus ensuring the effectiveness of the filtering process.

[0209] Step a4: the translation result of performing pixel translation on the reconstructed block and / or predicted block of at least one image block;

[0210] Optionally, pixel cleaning can be performed on the even-numbered columns of the boundaries (such as the leftmost and / or rightmost) of at least one reconstructed block and / or prediction block, and then pixel translation can be performed on the regions in the boundaries of at least one reconstructed block and / or prediction block that have undergone pixel cleaning.

[0211] Optionally, pixel cleaning can be performed on the even-numbered rows of the boundaries (such as the topmost and / or bottommost) of at least one reconstructed block and / or prediction block, and then pixel translation can be performed on the regions in the boundaries of at least one reconstructed block and / or prediction block that have undergone pixel cleaning.

[0212] Optionally, pixel translation can be performed by copying the pixels of an adjacent area that has the same size as the area that has undergone pixel cleaning, and then translating the copied pixels into the area that has undergone pixel cleaning.
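A minimal sketch of this translation for the left boundary is given below. It assumes the cleaned area is a band of `n_cols` columns and that the adjacent area is the band of the same width immediately to its right; the function name and parameter are hypothetical.

```python
import numpy as np

def clear_and_translate_left(block, n_cols=2):
    """Clean n_cols columns at the left boundary, then translate the adjacent band into them."""
    if block.shape[1] < 2 * n_cols:
        raise ValueError("block is too narrow for the requested translation")
    out = block.copy()
    out[:, :n_cols] = 0                              # pixel cleaning of the boundary area
    out[:, :n_cols] = out[:, n_cols:2 * n_cols]      # copy the adjacent same-size area into it
    return out

block = np.arange(16).reshape(4, 4)                  # stand-in for a reconstructed block
translated = clear_and_translate_left(block)
print(translated[0])                                 # [2 3 2 3]
```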

[0213] Optionally, the translation result includes at least one reconstructed block after pixel translation, and can be used as the derived block corresponding to the reconstructed block.

[0214] Optionally, the translation result includes at least one predicted block after pixel translation, and can be used as the derived block corresponding to the predicted block.

[0215] In this embodiment, a derived block is determined or obtained based on the translation result of pixel translation of at least one image block's reconstructed block and / or predicted block. This ensures that the derived block is closely related to the reconstructed block and / or predicted block, which facilitates subsequent use of the derived block's transformation features, combined with multiple input branches, neural networks, and / or lookup tables, for filtering processing, thus ensuring the effectiveness of the filtering process.

[0216] Step a5: The result of pixel editing of the reconstructed block and / or predicted block of at least one image patch.

[0217] Optionally, even columns of the boundaries (such as the leftmost and / or rightmost) of at least one reconstructed block and / or predicted block can be pixel-edited, for example, to zero pixels.

[0218] Optionally, even-numbered rows of the boundaries (such as the topmost and / or bottommost) of at least one reconstructed block and / or predicted block can be pixel-edited, for example, to zero pixels.

[0219] Optionally, the edit result includes at least one reconstructed block that has been pixel-edited, and can be used as a derived block corresponding to the reconstructed block.

[0220] Optionally, the edited result includes at least one predicted block that has been pixel-edited, and can be used as a derived block corresponding to the predicted block.

[0221] In this embodiment, a derived block is determined or obtained based on the editing results of pixel editing of at least one image block's reconstructed block and / or predicted block. This ensures that the derived block is closely related to the reconstructed block and / or predicted block, which facilitates subsequent use of the derived block's transformed features, combined with multiple input branches, neural networks, and / or lookup tables, for filtering processing, thus ensuring the effectiveness of the filtering process.

[0222] In this embodiment, referring to Figure 8, step S1 includes steps S11 and S12:

[0223] Step S11: Channel splicing is performed based on multiple input branches and at least one image patch feature to determine or obtain the input features;

[0224] Optionally, at least one image patch may correspond to multiple channels, and each channel may correspond to at least one image patch feature.

[0225] Optionally, each input branch in the multi-input branch may include image patch features of at least one channel. The image patch features of at least one input branch can be concatenated according to the channel dimension to obtain multi-channel features, which are then used as input features.

[0226] Optionally, the multiple input branches include at least one of methods one through three:

[0227] Method 1 is the first input branch used to process the luminance component;

[0228] Optionally, the input branches for channel splicing of different components can be pre-configured.

[0229] Optionally, in the first input branch, the luminance component of the image block can be processed by channel concatenation. For example, the first feature of the derived block on the luminance component and the second feature of the reconstructed block and / or the predicted block on the luminance component can be concatenated by channel, and then the input of the neural network and / or lookup table can be determined or obtained based on the result of the channel concatenation.

[0230] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the first input branch used to process the luminance component, all information of the luminance component can be integrated together, and then filtered through the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0231] Optionally, in method one, the first input branch includes at least one of methods 1 to 5:

[0232] Method 1 is the first sub-input branch used to process the reconstructed transform features and derived transform features on the luminance component;

[0233] Optionally, the first input branch includes a first sub-input branch, in which the reconstruction transformation features of the reconstruction block on the luminance component and the derivative transformation features of the corresponding derivative block of the reconstruction block can be processed. For example, the reconstruction transformation features and the derivative transformation features can be concatenated by channels to obtain multi-channel features, which can be used as input features to be input into a lookup table and / or a neural network for filtering.

[0234] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the first sub-input branch, the reconstruction transformation features of the reconstructed block of the luminance component and the derivative transformation features of the derived block can be integrated together, and then filtered by the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0235] Method 2 is a second sub-input branch used to process the predicted transform features and derived transform features on the luminance component;

[0236] Optionally, the first input branch includes a second sub-input branch. In the second sub-input branch, the predicted transformation features of the predicted block on the luminance component and the derived transformation features of the corresponding derived block can be processed. For example, the predicted transformation features and the derived transformation features can be concatenated by channels to obtain multi-channel features, which can be used as input features to be input into a lookup table and / or neural network for filtering.

[0237] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the second sub-input branch, the prediction transformation features of the prediction block of the luminance component and the derivative transformation features of the derivative block can be integrated together, and then filtered by the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0238] Method 3 is a third sub-input branch used to process the reconstructed transform features on the luminance component;

[0239] Optionally, the first input branch includes a third sub-input branch, in which the reconstruction transformation features of the reconstruction block can be processed. For example, the channel stitching of all reconstruction transformation features of at least one reconstruction block on the luminance component can be performed, or other processing operations can be performed, such as converting them into a format that can be recognized by the neural network and / or lookup table, so as to obtain the corresponding input features, so as to input them into the lookup table and / or neural network for filtering processing.

[0240] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the third sub-input branch, the reconstruction transformation features of all channels of the luminance component reconstruction block can be integrated together, and then filtered by the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0241] Method 4 is a fourth sub-input branch used to process the predicted transform features on the luminance component;

[0242] Optionally, the first input branch includes a fourth sub-input branch, in which the prediction transformation features of the prediction block can be processed. For example, the prediction transformation features of at least one prediction block in the luminance component can be concatenated into channels, or other processing operations can be performed, such as converting them into a format that can be recognized by the neural network and / or lookup table, so as to obtain the corresponding input features, so as to input them into the lookup table and / or neural network for filtering processing.

[0243] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the fourth sub-input branch, the prediction transformation features of all channels of the prediction block of the luminance component can be integrated together, and then filtered by the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0244] Method 5 is the fifth sub-input branch used to process the derived transform features on the luminance component.

[0245] Optionally, the first input branch includes a fifth sub-input branch, in which the derived transform features corresponding to the derived blocks of the prediction block and / or reconstruction block can be processed. For example, all derived transform features of at least one derived block on the luminance component can be concatenated into channels, or other processing operations can be performed, such as converting them into a format that can be recognized by the neural network and / or lookup table, so as to obtain the corresponding input features, so as to input them into the lookup table and / or neural network for filtering processing.

[0246] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the fifth sub-input branch, the derived transformation features of all channels of the luminance component's derived block can be integrated together, and then filtered through the neural network and / or lookup table, ensuring the effectiveness of the filtering process.
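The five luminance sub-input branches described above can be summarized as a mapping from each sub-branch to the kinds of transform features it concatenates. The sketch below is illustrative only: the dictionary keys, the feature names ("rec", "pred", "rec_derived", "pred_derived") and the 72x72x4 feature shape are assumptions, and all features are assumed to share the same spatial size so that they can be concatenated along the channel dimension.

```python
import numpy as np

# Which feature kinds each luminance sub-input branch consumes (names are hypothetical).
SUB_BRANCH_FEATURES = {
    1: ("rec", "rec_derived"),            # reconstructed + derived transform features
    2: ("pred", "pred_derived"),          # predicted + derived transform features
    3: ("rec",),                          # reconstructed transform features only
    4: ("pred",),                         # predicted transform features only
    5: ("rec_derived", "pred_derived"),   # derived transform features only
}

def build_input(features, sub_branch):
    """Concatenate the features selected by a sub-input branch along the channel axis."""
    names = SUB_BRANCH_FEATURES[sub_branch]
    return np.concatenate([features[name] for name in names], axis=-1)

# Placeholder luminance features, each with 4 channels.
luma_features = {
    name: np.zeros((72, 72, 4))
    for name in ("rec", "pred", "rec_derived", "pred_derived")
}
print(build_input(luma_features, 1).shape)   # (72, 72, 8)
print(build_input(luma_features, 3).shape)   # (72, 72, 4)
```

The resulting multi-channel tensor is what would then be passed to the neural network and / or lookup table.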

[0247] Method 2 is the second input branch used to process the chromaticity components;

[0248] Optionally, in the second input branch, the chromaticity components of the image patch can be processed by concatenating each channel. For example, the first feature of the derived patch in the chromaticity component and the second feature of the reconstructed patch and / or predicted patch in the chromaticity component can be concatenated by channel, and then the input of the neural network and / or lookup table can be determined or obtained based on the result of the channel concatenation.

[0249] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the second input branch used to process the chroma components, all the information of the chroma components can be integrated together, and then filtered through the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0250] Optionally, in method two, the second input branch includes at least one of methods 6 to 12:

[0251] Method 6 is the sixth sub-input branch used to process the reconstructed transform features and derived transform features on the chromaticity components;

[0252] Optionally, the second input branch includes a sixth sub-input branch, in which the reconstruction transformation features of the reconstructed block on the chroma component and the derivative transformation features of the corresponding derivative block of the reconstructed block can be processed. For example, the reconstruction transformation features and the derivative transformation features can be concatenated by channels to obtain multi-channel features, which can be used as input features to be input into a lookup table and / or a neural network for filtering.

[0253] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the sixth sub-input branch, the reconstruction transformation features of the chroma component reconstruction block and the derivative transformation features of the derivative block can be integrated together, and then filtered by the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0254] Method 7 is the seventh sub-input branch used to process the predicted transform features and derived transform features on the chromaticity components;

[0255] Optionally, the second input branch includes a seventh sub-input branch, in which the predicted transformation features of the predicted block on the chroma component and the derived transformation features of the corresponding derived block can be processed. For example, the predicted transformation features and the derived transformation features can be concatenated by channels to obtain multi-channel features, which can be used as input features to be input into a lookup table and / or neural network for filtering.

[0256] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the seventh sub-input branch, the prediction transform features of the chroma component prediction block and the derivative transform features of the derivative block can be integrated together, and then filtered by the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0257] Method 8 is the eighth sub-input branch used to process the reconstructed transform features on the chromaticity components;

[0258] Optionally, the second input branch includes an eighth sub-input branch, in which the reconstruction transformation features of the reconstruction block can be processed. For example, the channel concatenation of all reconstruction transformation features of at least one reconstruction block on the chroma component can be performed, or other processing operations can be performed, such as converting them into a format that can be recognized by the neural network and / or lookup table, so as to obtain the corresponding input features, so as to input them into the lookup table and / or neural network for filtering processing.

[0259] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the eighth sub-input branch, the reconstruction transformation features of all channels of the chroma component reconstruction block can be integrated together, and then filtered by the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0260] Method 9 is the ninth sub-input branch used to process the predicted transform features on the chromaticity components;

[0261] Optionally, the second input branch includes a ninth sub-input branch, in which the prediction transformation features of the prediction block can be processed. For example, the prediction transformation features of at least one prediction block in the chromaticity component can be concatenated into channels, or other processing operations can be performed, such as converting them into a format that can be recognized by the neural network and / or lookup table, so as to obtain the corresponding input features, so as to input them into the lookup table and / or neural network for filtering processing.

[0262] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the ninth sub-input branch, the prediction transformation features of all channels of the chroma component prediction block can be integrated together, and then filtered by the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0263] Method 10 is the tenth sub-input branch used to process the derived transform features on the chromaticity components;

[0264] Optionally, the second input branch includes a tenth sub-input branch, in which the derived transformation features corresponding to the derived blocks of the prediction block and / or reconstruction block are processed. For example, the channel concatenation of all derived transformation features of at least one derived block on the chroma component can be performed, or other processing operations can be performed, such as converting them into a format that can be recognized by the neural network and / or lookup table, so as to obtain the corresponding input features, so as to input them into the lookup table and / or neural network for filtering processing.

[0265] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the tenth sub-input branch, the derivation transformation features of all channels of the chroma component's derivation block can be integrated together, and then filtered through the neural network and / or lookup table, ensuring the effectiveness of the filtering process.

[0266] Method 11 is the fourth input branch used to process the U component;

[0267] Optionally, the second input branch includes a fourth input branch.

[0268] Optionally, in the fourth input branch, the U component of the image patch can be processed by concatenating each channel. For example, the first feature of the derived patch on the U component and the second feature of the reconstructed patch and / or predicted patch on the U component can be concatenated by channel. Then, the input of the neural network and / or lookup table can be determined or obtained based on the result of the channel concatenation.

[0269] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the fourth input branch used to process the U component, all the information of the U component can be integrated together, and then filtered through the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0270] Method 12 is the fifth input branch used to process the V component.

[0271] Optionally, the second input branch includes the fifth input branch.

[0272] Optionally, in the fifth input branch, the V component of the image patch can be processed by concatenating each channel. For example, the first feature of the derived patch on the V component and the second feature of the reconstructed patch and / or predicted patch on the V component can be concatenated by channel, and then the input of the neural network and / or lookup table can be determined or obtained based on the result of the channel concatenation.

[0273] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the fifth input branch used to process the V component, all the information of the V component can be integrated together, and then filtered through the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0274] Method 3 is the third input branch used for mixing the luminance and chrominance components.

[0275] Optionally, in the third input branch, the luminance and chrominance components of the image patch can be processed by concatenating each channel. For example, the first features of the derived patch in the luminance and chrominance components, and the second features of the reconstructed patch and / or predicted patch in the luminance and chrominance components can be concatenated by channel, and then the input of the neural network and / or lookup table can be determined or obtained based on the result of the channel concatenation.

[0276] In this embodiment, by determining or obtaining the input of the neural network and / or lookup table based on the third input branch used for mixing the luminance and chrominance components, all information of the luminance and chrominance components can be integrated together, and then filtered through the neural network and / or lookup table, thus ensuring the effectiveness of the filtering process.

[0277] Optionally, in step S11, channel stitching is performed based on multiple input branches and at least one image patch feature, including at least one of steps b1 to b15:

[0278] Step b1: Based on the first input branch, perform channel splicing on the predicted transformation features of at least one luminance component, the reconstructed transformation features of at least one luminance component, and the derived transformation features of at least one luminance component.

[0279] Optionally, the individual features of the luminance component can be processed in the first input branch.

[0280] Optionally, in the first input branch, the prediction transformation features of at least one prediction block on the luminance component, the reconstruction transformation features of at least one reconstruction block on the luminance component, the derivative transformation features of the derivative blocks of at least one reconstruction block on the luminance component, and the derivative transformation features of the derivative blocks of at least one prediction block on the luminance component can be determined.

[0281] Channel concatenation can be performed on at least one predicted transform feature, at least one reconstructed transform feature, at least one derived transform feature corresponding to a derived block of a reconstructed block, and at least one derived transform feature corresponding to a derived block of a predicted block to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0282] For example, as shown in Figure 9, taking the luminance component as an example, in the first input branch, the prediction block (PredY), the derived block of the prediction block (PredY Overlap), the reconstruction block (RecY), and the derived block of the reconstruction block (RecY Overlap), each with a size of 144x144x1, are each subjected to a 2x2 DCT transformation (i.e., DCT-II in the figure) and reshaped into a 72x72x4 tensor. Then, the four tensors are concatenated along the channel dimension through the Rec module to obtain a 72x72x16 tensor (containing multi-channel features). This tensor is then input into the neural network for model training; for example, a Conv3x3, 16 convolutional layer performs grouped convolution to achieve feature fusion. Finally, the filtered image patch is output.
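The tensor shapes in this example can be checked with the NumPy sketch below. It is only a sketch under stated assumptions: the function name `dct2x2_to_channels` is hypothetical, an orthonormal 2-point DCT-II is assumed (the source does not state the normalization), the four coefficients of each 2x2 tile are assumed to become the four channels, and random data stands in for the actual blocks. The convolutional layer is not modeled.

```python
import numpy as np

def dct2x2_to_channels(block):
    """Apply a 2x2 DCT-II to each non-overlapping 2x2 tile and stack the 4 coefficients as channels."""
    h, w = block.shape
    c = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)   # orthonormal 2-point DCT-II matrix
    # Split into tiles: tiles[a, b] is the 2x2 tile at rows 2a..2a+1, columns 2b..2b+1.
    tiles = block.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    coeffs = np.einsum("ij,hwjk,lk->hwil", c, tiles, c)      # C @ tile @ C.T for every tile
    return coeffs.reshape(h // 2, w // 2, 4)

rng = np.random.default_rng(0)
names = ["PredY", "PredY Overlap", "RecY", "RecY Overlap"]
blocks = {name: rng.random((144, 144)) for name in names}    # placeholder 144x144x1 blocks

features = [dct2x2_to_channels(blocks[name]) for name in names]   # four 72x72x4 tensors
net_input = np.concatenate(features, axis=-1)                     # channel concatenation
print(features[0].shape)   # (72, 72, 4)
print(net_input.shape)     # (72, 72, 16)
```

Halving each spatial dimension while quadrupling the channel count keeps the number of values per block unchanged (144 x 144 = 72 x 72 x 4), so the transform rearranges the block rather than discarding information.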

[0283] In this embodiment, by performing channel concatenation on the predicted transform feature, the reconstructed transform feature, and the derived transform feature of at least one luminance component based on the first input branch, at least one input feature is determined or obtained. Then, the channel concatenation result is processed by a neural network and / or a lookup table, which can effectively capture the transition features between image blocks on the luminance component, improve the filtering effect of the filtering process, and thus improve the effect of video encoding and / or decoding.

[0284] Step b2: Based on the second input branch, perform channel splicing on the predicted transformation features of at least one chromaticity component, the reconstructed transformation features of at least one chromaticity component, and the derived transformation features of at least one chromaticity component.

[0285] Optionally, the individual features of the chroma components can be processed in the second input branch.

[0286] Optionally, in the second input branch, the prediction transformation features of at least one prediction block on the chromaticity component, the reconstruction transformation features of at least one reconstruction block on the chromaticity component, the derivative transformation features of the derivative blocks of at least one reconstruction block on the chromaticity component, and the derivative transformation features of the derivative blocks of at least one prediction block on the chromaticity component can be determined.

[0287] Optionally, at least one predicted transform feature, at least one reconstructed transform feature, at least one derived transform feature corresponding to the derived block of the reconstructed block, and at least one derived transform feature corresponding to the derived block of the predicted block can be concatenated to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0288] In this embodiment, by performing channel concatenation on the predicted transform feature of at least one chroma component, the reconstructed transform feature of at least one chroma component, and the derived transform feature of at least one chroma component according to the second input branch, at least one input feature is determined or obtained. Then, the channel concatenation result is processed by a neural network and / or a lookup table, which can effectively capture the transition features between image blocks on the chroma component, improve the filtering effect of the filtering process, and thus improve the effect of video encoding and / or decoding.

[0289] Step b3: Based on the third input branch, perform channel splicing on at least one of the following: the predicted transformation feature of at least one luminance component and / or chrominance component, the derived transformation feature of at least one luminance component and / or chrominance component, and the reconstructed transformation feature of at least one luminance component and / or chrominance component.

[0290] Optionally, the individual features of the luminance component and / or the chromaticity component can be processed in the third input branch.

[0291] Optionally, in the third input branch, the prediction transformation features of at least one prediction block on the chromaticity component, the prediction transformation features of at least one prediction block on the luminance component, the reconstruction transformation features of at least one reconstruction block on the chromaticity component, the reconstruction transformation features of at least one reconstruction block on the luminance component, the derivative transformation features of the derivative blocks of at least one reconstruction block on the chromaticity component, the derivative transformation features of the derivative blocks of at least one reconstruction block on the luminance component, the derivative transformation features of the derivative blocks of at least one prediction block on the chromaticity component, and the derivative transformation features of the derivative blocks of at least one prediction block on the luminance component can be determined.

[0292] Optionally, at least one of the following can be concatenated to determine or obtain at least one input feature: the predicted transform feature of at least one luminance component and / or chrominance component, the derived transform feature of at least one luminance component and / or chrominance component (including the derived transform feature corresponding to the reconstruction block and / or the derived transform feature corresponding to the prediction block), and the reconstructed transform feature of at least one luminance component and / or chrominance component. Then, the at least one input feature can be filtered by a neural network and / or a lookup table.

[0293] In this embodiment, by performing channel concatenation on at least one of the following based on the third input branch: the predicted transform feature of at least one luminance component and / or chrominance component, the derived transform feature of at least one luminance component and / or chrominance component, and the reconstructed transform feature of at least one luminance component and / or chrominance component, at least one input feature is determined or obtained. Then, the channel concatenation result is processed by a neural network and / or a lookup table. This can effectively capture the transition features between image blocks on the chrominance and luminance components, improve the filtering effect of the filtering process, and thus improve the effect of video encoding and / or decoding.
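The channel splicing described in step b3 can be illustrated with a minimal sketch. It is not the implementation of this application: the function name `concat_channels`, the list-of-channels representation and the sample values are all assumptions made for illustration, and the sketch further assumes that the luminance and chrominance features already share the same spatial size.

```python
# A feature map is modelled as a list of channels; each channel is a 2-D list (H x W).
def concat_channels(*feature_maps):
    """Concatenate feature maps along the channel axis.

    All channels must share the same spatial size (H x W).
    """
    channels = [ch for fmap in feature_maps for ch in fmap]
    if not channels:
        raise ValueError("nothing to concatenate")
    h, w = len(channels[0]), len(channels[0][0])
    for ch in channels:
        if len(ch) != h or any(len(row) != w for row in ch):
            raise ValueError("all channels must share the same H x W")
    return channels


# Hypothetical 2x2 transformation features for one block (values are made up).
pred_luma = [[[1, 2], [3, 4]]]                          # 1 channel
rec_luma = [[[5, 6], [7, 8]]]                           # 1 channel
derived_chroma = [[[0, 1], [1, 0]], [[2, 2], [2, 2]]]   # 2 channels

# The spliced result is the "input feature" handed to the filter.
input_feature = concat_channels(pred_luma, rec_luma, derived_chroma)
```

In this sketch the spliced input feature simply stacks the channels in the order given, so the filter that follows sees all selected features of the block at once.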

[0294] Step b4: Based on the first sub-input branch, perform channel splicing on the reconstructed transformation features of at least one luminance component and the derived transformation features of at least one luminance component.

[0295] Optionally, in the first sub-input branch, the features of the reconstructed block and the corresponding derived block on the luminance component can be processed.

[0296] Optionally, in the first sub-input branch, the reconstruction transformation features of the reconstruction block on the luminance component and the derived transformation features of the derived block of the reconstruction block on the luminance component can be determined.

[0297] Optionally, at least one reconstructed transform feature and at least one derived transform feature can be concatenated in the first sub-input branch to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0298] In this embodiment, by performing channel concatenation on the reconstructed transformation features of at least one luminance component and the derived transformation features of at least one luminance component based on the first sub-input branch, at least one input feature is determined or obtained. Then, the channel concatenation result is processed by a neural network and / or a lookup table, which can effectively capture the transition features between reconstructed blocks on the luminance component, improve the filtering effect of the filtering process, and thus improve the effect of video encoding and / or decoding.

[0299] Step b5: Based on the second sub-input branch, perform channel splicing on the predicted transformation features of at least one luminance component and the derived transformation features of at least one luminance component.

[0300] Optionally, in the second sub-input branch, the features of the prediction block and the corresponding derived block on the luminance component can be processed.

[0301] Optionally, in the second sub-input branch, the prediction transformation features of the prediction block on the luminance component and the derived transformation features of the derived block of the prediction block on the luminance component can be determined.

[0302] Optionally, at least one predicted transform feature and at least one derived transform feature can be concatenated in the second sub-input branch to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0303] In this embodiment, by performing channel concatenation on the predicted transform features of at least one luminance component and the derived transform features of at least one luminance component based on the second sub-input branch, at least one input feature is determined or obtained. Then, the channel concatenation result is processed by a neural network and / or a lookup table, which can effectively capture the transition features between prediction blocks on the luminance component, improve the filtering effect of the filtering process, and thus improve the effect of video encoding and / or decoding.

[0304] Step b6: Based on the third sub-input branch, perform channel splicing on the reconstruction transformation features of at least one luminance component.

[0305] Optionally, the features of the reconstructed block on the luminance component can be processed in the third sub-input branch.

[0306] Optionally, in the third sub-input branch, the reconstruction transformation features of the reconstruction block across all channels of the luminance component can be determined.

[0307] Optionally, in the third sub-input branch, the reconstructed transformation features of at least two channels can be concatenated to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0308] In this embodiment, by performing channel splicing on the reconstruction transformation features of at least one luminance component based on the third sub-input branch to determine or obtain at least one input feature, and then processing the channel splicing result through a neural network and / or lookup table, it is possible to achieve unified multi-channel processing of the reconstruction transformation features of the reconstruction block on the luminance component, thereby improving the filtering effect of the filtering process.

[0309] Step b7: Based on the fourth sub-input branch, perform channel splicing on the predicted transformation features of at least one luminance component.

[0310] Optionally, the features of the prediction block on the luminance component can be processed in the fourth sub-input branch.

[0311] Optionally, in the fourth sub-input branch, the prediction transformation features of the prediction block across all channels of the luminance component can be determined.

[0312] Optionally, in the fourth sub-input branch, the predicted transform features of at least two channels can be concatenated to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0313] In this embodiment, by performing channel splicing on the prediction transformation features of at least one luminance component based on the fourth sub-input branch to determine or obtain at least one input feature, and then processing the channel splicing result through a neural network and / or lookup table, it is possible to achieve unified multi-channel processing of the prediction transformation features of the prediction block on the luminance component, thereby improving the filtering effect of the filtering process.

[0314] Step b8: Based on the fifth sub-input branch, perform channel splicing on the derived transformation features of at least one luminance component.

[0315] Optionally, in the fifth sub-input branch, the individual features of the derived blocks of the predicted block and / or the derived blocks of the reconstructed block on the luminance component can be processed.

[0316] Optionally, in the fifth sub-input branch, the derived transformation features of the derived block across all channels of the luminance component can be determined.

[0317] Optionally, in the fifth sub-input branch, the derived transformation features of at least two channels can be concatenated to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0318] In this embodiment, by performing channel splicing on the derived transformation features of at least one luminance component based on the fifth sub-input branch, and then processing the channel splicing results through a neural network and / or lookup table, it is possible to achieve unified multi-channel processing of the derived transformation features of the derived block on the luminance component, thereby improving the filtering effect of the filtering process.

[0319] Step b9: Based on the sixth sub-input branch, perform channel splicing on the reconstruction transformation features of at least one chroma component and the derived transformation features of at least one chroma component.

[0320] Optionally, in the sixth sub-input branch, the features of the reconstructed block and the corresponding derived block on the chroma component can be processed.

[0321] Optionally, in the sixth sub-input branch, the reconstruction transformation features of the reconstruction block on the chroma component and the derived transformation features of the derived block of the reconstruction block on the chroma component can be determined.

[0322] Optionally, at least one reconstructed transform feature and at least one derived transform feature can be concatenated in the sixth sub-input branch to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0323] In this embodiment, by performing channel splicing on the reconstructed transformation features of at least one chroma component and the derived transformation features of at least one chroma component based on the sixth sub-input branch, and then processing the channel splicing results through a neural network and / or lookup table, it is possible to effectively capture the transition features between reconstructed blocks on the chroma component, thereby improving the filtering effect of the filtering process and thus improving the effect of video encoding and / or decoding.

[0324] Step b10: Based on the seventh sub-input branch, perform channel splicing on the predicted transform features of at least one chroma component and the derived transform features of at least one chroma component.

[0325] Optionally, in the seventh sub-input branch, the features of the prediction block and the corresponding derived block on the chroma component can be processed.

[0326] Optionally, in the seventh sub-input branch, the prediction transform features of the prediction block on the chroma component and the derived transform features of the derived block of the prediction block on the chroma component can be determined.

[0327] Optionally, at least one predicted transform feature and at least one derived transform feature can be concatenated in the seventh sub-input branch to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0328] In this embodiment, by performing channel concatenation based on the predicted transform features of at least one chroma component and the derived transform features of at least one chroma component according to the seventh sub-input branch, at least one input feature is determined or obtained. Then, the channel concatenation result is processed by a neural network and / or a lookup table, which can effectively capture the transition features between prediction blocks on the chroma component, improve the filtering effect of the filtering process, and thus improve the effect of video encoding and / or decoding.

[0329] Step b11: Based on the eighth sub-input branch, perform channel splicing on the reconstruction transformation features of at least one chroma component.

[0330] Optionally, the features of the reconstructed block on the chroma component can be processed in the eighth sub-input branch.

[0331] Optionally, in the eighth sub-input branch, the reconstruction transformation features of the reconstruction block across all channels of the chroma component can be determined.

[0332] Optionally, in the eighth sub-input branch, the reconstructed transformation features of at least two channels can be concatenated to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0333] In this embodiment, at least one input feature is determined or obtained by concatenating the reconstructed transformation features of at least one chroma component according to the eighth sub-input branch. Then, the channel concatenation result is processed by a neural network and / or a lookup table, thereby realizing unified multi-channel processing of the reconstructed transformation features of the reconstructed block on the chroma component, thereby improving the filtering effect of the filtering process.

[0334] Step b12: Based on the ninth sub-input branch, perform channel splicing on the predicted transform features of at least one chroma component.

[0335] Optionally, the features of the prediction block on the chroma component can be processed in the ninth sub-input branch.

[0336] Optionally, in the ninth sub-input branch, the prediction transform features of the prediction block across all channels of the chroma component can be determined.

[0337] Optionally, in the ninth sub-input branch, the predicted transform features of at least two channels can be concatenated to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0338] In this embodiment, by performing channel splicing on the prediction transformation features of at least one chroma component based on the ninth sub-input branch to determine or obtain at least one input feature, and then processing the channel splicing results through a neural network and / or lookup table, it is possible to achieve unified multi-channel processing of the prediction transformation features of the prediction block on the chroma component, thereby improving the filtering effect of the filtering process.

[0339] Step b13: Based on the tenth sub-input branch, perform channel splicing on the derived transformation features of at least one chroma component.

[0340] Optionally, in the tenth sub-input branch, the individual features of the derived blocks of the predicted block and / or the derived blocks of the reconstructed block on the chroma component can be processed.

[0341] Optionally, in the tenth sub-input branch, the derived transform features of the derived block across all channels of the chroma component can be determined.

[0342] Optionally, in the tenth sub-input branch, the derived transform features of at least two channels can be concatenated to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0343] In this embodiment, by performing channel splicing on the derived transformation features of at least one chroma component according to the tenth sub-input branch, and then processing the channel splicing results through a neural network and / or lookup table, it is possible to achieve unified multi-channel processing of the derived transformation features of the derived block on the chroma component, thereby improving the filtering effect of the filtering process.

[0344] Step b14: Based on the fourth input branch, perform channel splicing on the predicted transformation features of at least one U component, the reconstructed transformation features of at least one U component, and the derived transformation features of at least one U component.

[0345] Optionally, the features of the U component can be processed in the fourth input branch.

[0346] Optionally, in the fourth input branch, the prediction transformation features of at least one prediction block on the U component, the reconstruction transformation features of at least one reconstruction block on the U component, the derivative transformation features of the derivative blocks of at least one reconstruction block on the U component, and the derivative transformation features of the derivative blocks of at least one prediction block on the U component can be determined.

[0347] Optionally, at least one predicted transform feature, at least one reconstructed transform feature, at least one derived transform feature corresponding to the derived block of the reconstructed block, and at least one derived transform feature corresponding to the derived block of the predicted block can be concatenated to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0348] In this embodiment, by channel concatenating the predicted transform feature, the reconstructed transform feature, and the derived transform feature of at least one U component according to the fourth input branch, at least one input feature is determined or obtained. Then, the channel concatenation result is processed by a neural network and / or a lookup table, which can effectively capture the transition features between image blocks on the U component, improve the filtering effect of the filtering process, and thus improve the effect of video encoding and / or decoding.

[0349] Step b15: Based on the fifth input branch, perform channel splicing on the predicted transform features of at least one V component, the reconstructed transform features of at least one V component, and the derived transform features of at least one V component.

[0350] Optionally, the individual features of the V component can be processed in the fifth input branch.

[0351] Optionally, in the fifth input branch, the prediction transformation features of at least one prediction block on the V component, the reconstruction transformation features of at least one reconstruction block on the V component, the derived transformation features of the derived blocks of at least one reconstruction block on the V component, and the derived transformation features of the derived blocks of at least one prediction block on the V component can be determined.

[0352] Optionally, at least one predicted transform feature, at least one reconstructed transform feature, at least one derived transform feature corresponding to the derived block of the reconstructed block, and at least one derived transform feature corresponding to the derived block of the predicted block can be concatenated to determine or obtain at least one input feature, and then the at least one input feature can be filtered by a neural network and / or a lookup table.

[0353] In this embodiment, by channel concatenating the predicted transform feature, the reconstructed transform feature, and the derived transform feature of at least one V component according to the fifth input branch, at least one input feature is determined or obtained. Then, the channel concatenation result is processed by a neural network and / or a lookup table, which can effectively capture the transition features between image blocks on the V component, improve the filtering effect of the filtering process, and thus improve the effect of video encoding and / or decoding.
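Steps b3 to b15 differ only in which components and which kinds of features each branch splices. That can be sketched as a routing table. The table `BRANCHES`, the keys such as `"first_sub"`, the function `build_input_feature` and the sample values are hypothetical names introduced here for illustration; treating a missing feature as "skipped" is likewise an assumption reflecting the "at least one of" wording.

```python
# Hypothetical routing table: branch -> (components, feature kinds) to splice.
BRANCHES = {
    "third_input":  (("luma", "chroma"), ("pred", "derived", "rec")),  # step b3
    "first_sub":    (("luma",), ("rec", "derived")),                   # step b4
    "second_sub":   (("luma",), ("pred", "derived")),                  # step b5
    "third_sub":    (("luma",), ("rec",)),                             # step b6
    "fourth_sub":   (("luma",), ("pred",)),                            # step b7
    "fifth_sub":    (("luma",), ("derived",)),                         # step b8
    "sixth_sub":    (("chroma",), ("rec", "derived")),                 # step b9
    "seventh_sub":  (("chroma",), ("pred", "derived")),                # step b10
    "eighth_sub":   (("chroma",), ("rec",)),                           # step b11
    "ninth_sub":    (("chroma",), ("pred",)),                          # step b12
    "tenth_sub":    (("chroma",), ("derived",)),                       # step b13
    "fourth_input": (("U",), ("pred", "rec", "derived")),              # step b14
    "fifth_input":  (("V",), ("pred", "rec", "derived")),              # step b15
}


def build_input_feature(branch, features):
    """Splice the channels selected by `branch` into one input feature.

    `features` maps (component, kind) to a list of channels; entries that
    are absent are skipped.
    """
    components, kinds = BRANCHES[branch]
    channels = []
    for comp in components:
        for kind in kinds:
            channels.extend(features.get((comp, kind), []))
    return channels


# Made-up 1x1 channels, one per available feature.
features = {
    ("luma", "rec"): [[[1]]],
    ("luma", "derived"): [[[2]]],
    ("luma", "pred"): [[[3]]],
    ("U", "pred"): [[[4]]],
}
first_sub_input = build_input_feature("first_sub", features)
u_input = build_input_feature("fourth_input", features)
```

Keeping the branch definitions in data rather than in separate code paths makes it easy to see that every branch feeds the same downstream filter with a differently composed input feature.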

[0354] Step S12: Filter at least one input feature based on the neural network and / or lookup table.

[0355] Optionally, at least one index can be determined or obtained based on at least one input feature, and the at least one index can be used to search the lookup table. The filtered image block can be determined or obtained based on the search result. The correspondence between each feature and its index can be set in advance, and the index corresponding to the at least one input feature can be determined based on this correspondence.

[0356] Alternatively, at least one input feature can be fed into the neural network for filtering.

[0357] In this embodiment, by filtering at least one input feature based on a neural network and / or a lookup table, the advantages of the neural network and / or lookup table can be combined, and the transition features between image blocks can be effectively captured through derived blocks, reconstructed blocks and / or prediction blocks, thereby improving the filtering effect of the filtering process and thus improving the effect of video encoding and / or decoding.
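The lookup-table path of step S12 can be illustrated as follows. This is only a sketch under stated assumptions: the quantization into `LEVELS` bins, the 8-bit sample range, the way the index is formed, and the toy table contents produced by `build_lut` are all hypothetical; the application only states that a preset correspondence between features and indices exists.

```python
LEVELS = 4       # assumed number of quantization levels per channel
MAX_VAL = 255    # assumed 8-bit sample range


def quantize(v):
    """Map a sample in [0, MAX_VAL] to a bin in [0, LEVELS - 1]."""
    return min(v * LEVELS // (MAX_VAL + 1), LEVELS - 1)


def feature_to_index(values):
    """Preset correspondence: per-channel values at one pixel -> table index."""
    idx = 0
    for v in values:
        idx = idx * LEVELS + quantize(v)
    return idx


def build_lut(num_channels):
    """Toy table: each entry is the mean of the bin centres of its channels.

    A practical table would be derived offline; this one only makes the
    sketch runnable.
    """
    step = (MAX_VAL + 1) // LEVELS
    lut = []
    for idx in range(LEVELS ** num_channels):
        digits, n = [], idx
        for _ in range(num_channels):
            digits.append(n % LEVELS)
            n //= LEVELS
        centres = [d * step + step // 2 for d in digits]
        lut.append(sum(centres) // num_channels)
    return lut


def lut_filter(input_feature, lut):
    """Filter by table search: one index and one table read per pixel."""
    h, w = len(input_feature[0]), len(input_feature[0][0])
    return [[lut[feature_to_index([ch[y][x] for ch in input_feature])]
             for x in range(w)] for y in range(h)]


# Two made-up channels of size 1x2.
filtered = lut_filter([[[0, 255]], [[0, 255]]], build_lut(2))
```

The point of the sketch is that the per-pixel cost is an index computation plus one table read, with no multiplications by learned weights at filtering time.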

[0358] Third Embodiment

[0359] Based on any of the above embodiments, a third embodiment is proposed.

[0360] In this embodiment, the neural network includes a grouped convolution module for dividing at least one input feature into at least one group for convolution processing, and / or the at least one group includes at least one of methods four to nineteen.

[0361] Optionally, since the neural network contains a grouped convolution module, at least one input feature can be input into the neural network, and then the grouped convolution module in the neural network can be used to group the at least one input feature. Alternatively, the grouping can be performed according to a one-to-one correspondence with the input branches, or according to the number of channels. No restrictions are imposed here.

[0362] Optionally, convolutional modules can be used for convolutional processing within each group, or partial convolutional modules can be used for partial convolutional processing; there are no restrictions on this.
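A minimal sketch of grouped convolution is given below, assuming a pointwise (1x1) kernel so that the code stays short. The function names `conv1x1` and `grouped_conv1x1`, the group sizes and the weights are illustrative assumptions, not values taken from this application.

```python
def conv1x1(channels, weights):
    """Pointwise convolution: weights[o][i] mixes input channel i into output o."""
    h, w = len(channels[0]), len(channels[0][0])
    return [[[sum(wt * ch[y][x] for wt, ch in zip(row_w, channels))
              for x in range(w)] for y in range(h)]
            for row_w in weights]


def grouped_conv1x1(channels, group_sizes, group_weights):
    """Split the channels into consecutive groups and convolve each separately.

    Channels in one group never mix with channels in another group.
    """
    if sum(group_sizes) != len(channels):
        raise ValueError("group sizes must cover all channels")
    out, start = [], 0
    for size, weights in zip(group_sizes, group_weights):
        out.extend(conv1x1(channels[start:start + size], weights))
        start += size
    return out


# Four made-up 1x1 channels: the first two could come from one input branch,
# the last two from another (grouping by branch is one option in the text).
channels = [[[1]], [[2]], [[3]], [[4]]]
grouped = grouped_conv1x1(
    channels,
    group_sizes=[2, 2],
    group_weights=[[[1, 1]], [[1, -1]]],   # one output channel per group
)
```

Here each group has its own weights, so a group can be matched one-to-one to an input branch or simply defined by a channel count, as the paragraph above allows.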

[0363] Method four is used to process at least one group of features, among the at least one input feature, that corresponds to the first input branch.

[0364] Optionally, in the neural network, at least one group corresponding to the first input branch can be determined, and within that group, the reconstruction transformation features of the reconstruction block on the luminance component, the derived transformation features of the derived block of the reconstruction block on the luminance component, the prediction transformation features of the prediction block on the luminance component, and the derived transformation features of the derived block of the prediction block on the luminance component are convolved by the convolution module.

[0365] In this embodiment, grouped convolution is performed in the neural network, and the at least one group of features corresponding to the first input branch among the at least one input feature is convolved within its group. This keeps the filtering process effective and reduces the amount of computation while maintaining feature expressive power.
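The reduction in computation can be made concrete by counting multiply-accumulate operations. The helper `conv_macs` and the layer dimensions below are hypothetical; the count itself follows from the fact that, with G groups, each output channel reads only 1/G of the input channels.

```python
def conv_macs(c_in, c_out, k, h, w, groups=1):
    """Multiply-accumulate count of a k x k convolution over an h x w map."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each group maps c_in/groups input channels to c_out/groups output channels.
    per_group = (c_in // groups) * (c_out // groups) * k * k * h * w
    return per_group * groups


# Illustrative sizes only: 64 -> 64 channels, 3x3 kernel, 32x32 block.
standard = conv_macs(64, 64, 3, 32, 32)
grouped = conv_macs(64, 64, 3, 32, 32, groups=4)
ratio = standard // grouped   # grouped convolution needs 1/groups of the MACs
```

Under these assumed sizes, splitting into four groups cuts the operation count to one quarter, at the price of no mixing between groups inside that layer.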

[0366] Method five is used to process at least one group of features, among the at least one input feature, that corresponds to the second input branch.

[0367] Optionally, in the neural network, at least one group corresponding to the second input branch can be determined, and within that group, the reconstruction transformation features of the reconstruction block on the chroma component, the derived transformation features of the derived block of the reconstruction block on the chroma component, the prediction transformation features of the prediction block on the chroma component, and the derived transformation features of the derived block of the prediction block on the chroma component are convolved by the convolution module.

[0368] In this embodiment, grouped convolution is performed in the neural network, and the at least one group of features corresponding to the second input branch among the at least one input feature is convolved within its group. This keeps the filtering process effective and reduces the amount of computation while maintaining feature expressive power.

[0369] Method six is used to process at least one group of features, among the at least one input feature, that corresponds to the third input branch.

[0370] Optionally, in the neural network, at least one group corresponding to the third input branch can be determined, and within that group, the following are convolved by a convolution module: the reconstruction transformation features of the reconstructed block on the chroma component; the derived transformation features of the derived block of the reconstructed block on the chroma component; the prediction transformation features of the prediction block on the chroma component; the derived transformation features of the derived block of the prediction block on the chroma component; the reconstruction transformation features of the reconstructed block on the luminance component; the derived transformation features of the derived block of the reconstructed block on the luminance component; the prediction transformation features of the prediction block on the luminance component; and the derived transformation features of the derived block of the prediction block on the luminance component.

[0371] In this embodiment, grouped convolution is performed in the neural network, and the at least one group of features corresponding to the third input branch among the at least one input feature is convolved within its group. This keeps the filtering process effective and reduces the amount of computation while maintaining feature expressive power.

[0372] Method seven is used to process at least one group of features, among the at least one input feature, that corresponds to the first sub-input branch.

[0373] Optionally, in the neural network, at least one group corresponding to the first sub-input branch can be determined, and within that group, the reconstruction transformation features of the reconstructed block on the luminance component and the derived transformation features of the derived block of the reconstructed block on the luminance component are convolved by a convolution module. Additionally or alternatively, the convolution can be performed by a partial convolution module, that is, only a portion of the reconstruction transformation features and / or derived transformation features are convolved.

[0374] In this embodiment, grouped convolution is performed in the neural network, and the at least one group of features corresponding to the first sub-input branch among the at least one input feature is convolved within its group. This keeps the filtering process effective and reduces the amount of computation while maintaining feature expressive power.
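The partial convolution option can be sketched as below. The text only states that a portion of the features is convolved; passing the remaining channels through unchanged is an assumption of this sketch, as are the function name `partial_conv1x1`, the pointwise kernel and the sample weights.

```python
def partial_conv1x1(channels, num_conv, weights):
    """Convolve only the first `num_conv` channels; keep the rest as they are.

    weights[o][i] mixes convolved input channel i into output channel o.
    """
    if not 0 < num_conv <= len(channels):
        raise ValueError("num_conv must select at least one existing channel")
    if any(len(row_w) != num_conv for row_w in weights):
        raise ValueError("each weight row needs num_conv entries")
    h, w = len(channels[0]), len(channels[0][0])
    head = channels[:num_conv]
    mixed = [[[sum(wt * ch[y][x] for wt, ch in zip(row_w, head))
               for x in range(w)] for y in range(h)]
             for row_w in weights]
    # Assumption: untouched channels are forwarded unchanged.
    return mixed + channels[num_conv:]


# Three made-up 1x1 channels; only the first two are convolved.
result = partial_conv1x1([[[1]], [[2]], [[3]]], 2, [[1, 1], [0, 2]])
```

Because the weights only cover the selected channels, the cost scales with the convolved portion rather than with the full group.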

[0375] Method eight is used to process at least one group of features, among the at least one input feature, that corresponds to the second sub-input branch.

[0376] Optionally, in the neural network, at least one group corresponding to the second sub-input branch can be determined, and within that group, the prediction transformation features of the prediction block on the luminance component and the derived transformation features of the derived block of the prediction block on the luminance component are convolved by a convolution module. Additionally or alternatively, the convolution can be performed by a partial convolution module, that is, only a portion of the prediction transformation features and / or derived transformation features are convolved.

[0377] In this embodiment, grouped convolution is performed in the neural network, and the at least one group of features corresponding to the second sub-input branch among the at least one input feature is convolved within its group. This keeps the filtering process effective and reduces the amount of computation while maintaining feature expressive power.

[0378] Method nine is used to process at least one group of features, among the at least one input feature, that corresponds to the third sub-input branch.

[0379] Optionally, in the neural network, at least one group corresponding to the third sub-input branch can be determined, and within that group, the reconstruction transformation features of the reconstruction block on the luminance component are convolved by a convolution module. Additionally or alternatively, the convolution can be performed by a partial convolution module, that is, only a portion of the reconstruction transformation features are convolved.

[0380] In this embodiment, grouped convolution is performed in the neural network, and the at least one group of features corresponding to the third sub-input branch among the at least one input feature is convolved within its group. This keeps the filtering process effective and reduces the amount of computation while maintaining feature expressive power.

[0381] Method ten is used to process at least one group of features, among the at least one input feature, that corresponds to the fourth sub-input branch.

[0382] Optionally, in the neural network, at least one group corresponding to the fourth sub-input branch can be determined, and within that group, the prediction transformation features of the prediction block on the luminance component are convolved by a convolution module. Additionally or alternatively, the convolution can be performed by a partial convolution module, that is, only a portion of the prediction transformation features are convolved.

[0383] In this embodiment, grouped convolution is performed in the neural network, and the at least one group of features corresponding to the fourth sub-input branch among the at least one input feature is convolved within its group. This keeps the filtering process effective and reduces the amount of computation while maintaining feature expressive power.

[0384] Method eleven is used to process at least one group of features, among the at least one input feature, that corresponds to the fifth sub-input branch.

[0385] Optionally, in the neural network, at least one group corresponding to the fifth sub-input branch can be determined, and within that group, the derived transformation features of the derived block of the prediction block on the luminance component, and / or the derived transformation features of the derived block of the reconstructed block on the luminance component, are convolved by a convolution module. Additionally or alternatively, the convolution can be performed by a partial convolution module, that is, only a portion of the derived transformation features are convolved.

[0386] In this embodiment, grouped convolution is performed in the neural network, and the at least one group of features corresponding to the fifth sub-input branch among the at least one input feature is convolved within its group. This keeps the filtering process effective and reduces the amount of computation while maintaining feature expressive power.

[0387] Method twelve is used to process at least one group of features, among the at least one input feature, that corresponds to the sixth sub-input branch.

[0388] Optionally, in the neural network, at least one group corresponding to the sixth sub-input branch can be determined, and within that group, the reconstruction transformation features of the reconstructed block on the chroma component and the derived transformation features of the derived block of the reconstructed block on the chroma component are convolved by a convolution module. Additionally or alternatively, the convolution can be performed by a partial convolution module, that is, only a portion of the reconstruction transformation features and / or derived transformation features are convolved.

[0389] In this embodiment, grouped convolution is performed in the neural network, and the at least one group of features corresponding to the sixth sub-input branch among the at least one input feature is convolved within its group. This keeps the filtering process effective and reduces the amount of computation while maintaining feature expressive power.

[0390] Method 13 is used to process at least one set of features corresponding to the seventh sub-input branch in at least one input feature;

[0391] Optionally, in the neural network, at least one set corresponding to the seventh sub-input branch can be determined. Within the at least one set corresponding to the seventh sub-input branch, a convolutional module convolves the prediction transformation features of the prediction block on the chroma component and the derived transformation features of the prediction block on the chroma component. Additionally or alternatively, the convolution processing can be performed by a partial convolutional module, that is, only a portion of the prediction transformation features and / or derived transformation features are convolved.

[0392] In this embodiment, grouped convolution processing is performed in the neural network, and at least one set of features corresponding to the seventh sub-input branch among the at least one input feature is used for the convolution processing. This keeps the filtering process effective and reduces the amount of computation while maintaining the expressive power of the features.

[0393] Method 14 is used to process at least one set of features corresponding to the eighth sub-input branch in at least one input feature;

[0394] Optionally, in the neural network, at least one set corresponding to the eighth sub-input branch can be determined. Within the at least one set corresponding to the eighth sub-input branch, a convolutional module convolves the reconstruction transformation features of the reconstructed block on the chroma component. Additionally or alternatively, the convolution processing can be performed by a partial convolutional module, that is, only a portion of the prediction transformation features and / or derived transformation features are convolved.

[0395] In this embodiment, grouped convolution processing is performed in the neural network, and at least one set of features corresponding to the eighth sub-input branch among the at least one input feature is used for the convolution processing. This keeps the filtering process effective and reduces the amount of computation while maintaining the expressive power of the features.

[0396] Method 15 is used to process at least one set of features corresponding to the ninth sub-input branch in at least one input feature;

[0397] Optionally, in the neural network, at least one set corresponding to the ninth sub-input branch can be determined. Within the at least one set corresponding to the ninth sub-input branch, a convolutional module convolves the prediction transformation features of the prediction block on the chroma component. Additionally or alternatively, the convolution processing can be performed by a partial convolutional module, that is, only a portion of the prediction transformation features and / or derived transformation features are convolved.

[0398] In this embodiment, grouped convolution processing is performed in the neural network, and at least one set of features corresponding to the ninth sub-input branch among the at least one input feature is used for the convolution processing. This keeps the filtering process effective and reduces the amount of computation while maintaining the expressive power of the features.

[0399] Method 16 is used to process at least one set of features corresponding to the tenth sub-input branch in at least one input feature;

[0400] Optionally, in the neural network, at least one set corresponding to the tenth sub-input branch can be determined. Within the at least one set corresponding to the tenth sub-input branch, a convolutional module performs convolution processing on the derived transformation features of the derived blocks of the prediction block on the chroma component, and / or the derived transformation features of the derived blocks of the reconstructed block on the chroma component. Alternatively, the convolution processing can be performed by a partial convolutional module, that is, only a portion of the prediction transformation features and / or derived transformation features are convolved.

[0401] In this embodiment, grouped convolution processing is performed in the neural network, and at least one set of features corresponding to the tenth sub-input branch among the at least one input feature is used for the convolution processing. This keeps the filtering process effective and reduces the amount of computation while maintaining the expressive power of the features.

[0402] Method 17 is used to process at least one set of features corresponding to the fourth input branch in at least one input feature;

[0403] Optionally, in the neural network, at least one set corresponding to the fourth input branch can be determined. Within the at least one set corresponding to the fourth input branch, a convolutional module performs convolution processing on the reconstruction transformation features of the reconstructed block on the U component, the derived transformation features of the reconstructed block on the U component, the prediction transformation features of the prediction block on the U component, and the derived transformation features of the prediction block on the U component. Additionally or alternatively, the convolution processing can be performed by a partial convolutional module, that is, only a portion of the prediction transformation features and / or derived transformation features are convolved.

[0404] In this embodiment, grouped convolution processing is performed in the neural network, and at least one set of features corresponding to the fourth input branch among the at least one input feature is used for the convolution processing. This keeps the filtering process effective and reduces the amount of computation while maintaining the expressive power of the features.

[0405] Method 18 is used to process at least one set of features corresponding to the fifth input branch in at least one input feature;

[0406] Optionally, in the neural network, at least one set corresponding to the fifth input branch can be determined. Within the at least one set corresponding to the fifth input branch, a convolutional module performs convolution processing on the reconstruction transformation features of the reconstructed block on the V component, the derived transformation features of the reconstructed block on the V component, the prediction transformation features of the prediction block on the V component, and the derived transformation features of the prediction block on the V component. Additionally or alternatively, the convolution processing can be performed by a partial convolutional module, that is, only a portion of the prediction transformation features and / or derived transformation features are convolved.

[0407] In this embodiment, grouped convolution processing is performed in the neural network, and at least one set of features corresponding to the fifth input branch among the at least one input feature is used for the convolution processing. This keeps the filtering process effective and reduces the amount of computation while maintaining the expressive power of the features.

[0408] Method 19: At least one set includes at least one channel from the multiple channels of each input branch.

[0409] Optionally, the input branch is at least one of the multiple input branches corresponding to at least one input feature.

[0410] Optionally, the image features can be grouped across channels. For example, in each of the multiple input branches, at least one channel of image features is selected to form a separate group for convolution processing.

[0411] Optionally, in each of the multiple input branches, the input feature of one channel dimension (such as a prediction transformation feature, a reconstruction transformation feature, or a derived transformation feature) can be selected into a group for convolution processing, until the input features of every channel dimension in each input branch have been convolved in their respective groups.

[0412] For example, suppose the multiple input branches include a first input branch and a second input branch. The first input branch contains input feature 1 with a channel dimension of 1, input feature 2 with a channel dimension of 2, and input feature 3 with a channel dimension of 3; the second input branch contains input feature 4 with a channel dimension of 4, input feature 5 with a channel dimension of 5, and input feature 6 with a channel dimension of 6. In the neural network, a grouped convolution module can then perform grouped convolution processing on the input features: the first group processes input feature 1 (channel dimension 1) and input feature 4 (channel dimension 4), the second group processes input feature 2 (channel dimension 2) and input feature 5 (channel dimension 5), and the third group processes input feature 3 (channel dimension 3) and input feature 6 (channel dimension 6).
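The cross-branch grouping in the example above can be sketched as follows. This is only an illustrative reading, not the claimed implementation: it assumes that every branch holds the same number of features, that each feature is a single-channel H x W array, and that group k simply collects the k-th feature of every branch. The function name `group_across_branches` and the array sizes are hypothetical.

```python
import numpy as np

def group_across_branches(branches):
    """Form group k from the k-th feature of every branch.

    Each branch is an array of shape (num_features, H, W); all branches are
    assumed (for this sketch) to hold the same number of features.
    """
    num_features = branches[0].shape[0]
    assert all(b.shape[0] == num_features for b in branches)
    # Each returned group has shape (num_branches, H, W).
    return [np.stack([b[k] for b in branches], axis=0) for k in range(num_features)]

# Two hypothetical branches, each with three features on a 4x4 block.
branch1 = np.arange(3 * 4 * 4, dtype=np.float32).reshape(3, 4, 4)  # features 1, 2, 3
branch2 = -branch1                                                 # features 4, 5, 6
groups = group_across_branches([branch1, branch2])
# groups[0] holds features 1 and 4, groups[1] features 2 and 5, groups[2] features 3 and 6.
```

Each group can then be passed to its own convolutional module.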

[0413] In this embodiment, grouped convolution processing is performed in the neural network, and at least one group containing at least one channel from the multiple channels of each input branch is used for the convolution processing. This keeps the filtering process effective and reduces the amount of computation while maintaining the feature representation capability.

[0414] Fourth embodiment

[0415] Based on any of the above embodiments, a fourth embodiment is proposed.

[0416] In this embodiment, the image processing method further includes at least one of the following methods 20 to 29:

[0417] Method 20: The grouped convolutional module includes at least one convolutional module corresponding to a group, and the convolutional modules corresponding to at least one group are independent of each other;

[0418] Optionally, in the neural network, the grouped convolutional module includes a convolutional module corresponding to at least one group. That is, some groups may have convolutional modules, while others may not. The outputs of all groups are then concatenated and further processed until the neural network outputs the corresponding filtered target image block.

[0419] Optionally, the convolutional modules corresponding to at least one group are independent of each other. There may be multiple groups, each applying its own corresponding convolutional module to perform the corresponding convolutional processing, and / or the convolutional modules in multiple groups may perform convolutional processing in parallel.

[0420] In this embodiment, the grouped convolutional module of the neural network includes a convolutional module corresponding to at least one group, and the convolutional modules corresponding to the groups are independent of each other, which ensures that the grouped convolution is performed effectively.
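A minimal sketch of Method 20 is given below, assuming 1x1 convolutions and NumPy arrays of shape (channels, H, W). Each group has its own weight matrix, a group without a convolutional module is passed through unchanged, and the group outputs are concatenated along the channel dimension. The names `conv1x1` and `grouped_conv`, and the use of `None` to mark a group with no module, are assumptions made for illustration.

```python
import numpy as np

def conv1x1(x, weight):
    """1x1 convolution: x has shape (C_in, H, W), weight has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def grouped_conv(groups, weights):
    """Apply each group's own weights independently, then concatenate the outputs.

    A weight of None marks a group that has no convolutional module (assumption
    of this sketch); its features are passed through unchanged.
    """
    outputs = [g if w is None else conv1x1(g, w) for g, w in zip(groups, weights)]
    return np.concatenate(outputs, axis=0)

rng = np.random.default_rng(0)
groups = [rng.standard_normal((2, 4, 4)) for _ in range(3)]
weights = [rng.standard_normal((2, 2)), None, rng.standard_normal((2, 2))]
out = grouped_conv(groups, weights)  # shape (6, 4, 4)
```

Because the per-group computations share no data, they could also be run in parallel.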

[0421] Method 21: At least two groups must have the same number of input channels;

[0422] Optionally, in a neural network, the number of input channels corresponding to the input features input to some or all groups is equal, that is, some or all groups process different input features with the same number of channels.

[0423] In this embodiment, at least two groups in the neural network contain the same number of input channels, which enables effective grouped convolution.
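The computational saving of grouped convolution can be illustrated with a simple multiply-accumulate count. The sketch below assumes stride 1, an output map of the same H x W size as the input, and groups with equal numbers of input and output channels; the channel counts and block size are hypothetical and are not taken from this application.

```python
def conv_macs(c_in, c_out, k, h, w, groups=1):
    """Multiply-accumulate count of a k x k convolution producing an h x w output."""
    assert c_in % groups == 0 and c_out % groups == 0
    per_group = (c_in // groups) * (c_out // groups) * k * k * h * w
    return per_group * groups

dense = conv_macs(64, 64, 3, 32, 32)              # ordinary convolution
grouped = conv_macs(64, 64, 3, 32, 32, groups=4)  # four equal-sized groups
```

With G equal-sized groups, the count falls by a factor of G, here from 37,748,736 to 9,437,184.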

[0424] Method 22: At least two groups have the same convolutional layer parameters;

[0425] Optionally, in a neural network, the convolutional modules within some groups can have the same convolutional layer parameters.

[0426] Optionally, the convolutional modules within some groups can have different convolutional layer parameters.

[0427] In this embodiment, at least two groups in the neural network have the same convolutional layer parameters, which enables effective grouped convolution.

[0428] Method 23: The grouped convolutional module includes, in sequence, a precursor layer for each group, a first channel shuffling layer, and a subsequent convolutional module for each group.

[0429] Method 24: The input of the first channel shuffling layer is the output of the precursor layers of at least two groups, and the output of the first channel shuffling layer is the input of the subsequent convolutional modules of at least two groups.

[0430] Method 25: The precursor layer includes at least one convolutional module for convolutional processing of at least one set of features;

[0431] Method 26: The first channel shuffling layer is used to perform cross-group permutation of the output feature map of the precursor layer in the channel dimension according to the first channel rearrangement rule;

[0432] Optionally, in the neural network, multiple convolutional modules and a first channel shuffling layer can be set in each group. The first channel shuffling layer can be placed between the convolutional modules. Within the same group, the convolutional modules located before the first channel shuffling layer serve as the precursor layer, and the convolutional modules located after it serve as the subsequent convolutional modules.

[0433] Optionally, the convolution module may include at least one convolutional layer, such as a 1x1 convolutional layer.

[0434] Optionally, the precursor layer may include a 1x1 convolutional layer, or a smaller lookup table may be used to replace at least one convolutional module included in the precursor layer. For example, the index corresponding to at least one set of features may be input into the lookup table for searching, and then cross-group permutation in the channel dimension may be performed through the first channel shuffling layer based on the search result.

[0435] Optionally, within at least one group, at least one convolutional module in the precursor layer performs deep feature fusion and transformation on all features within the at least one group to obtain higher quality features, which are then processed through the first channel shuffling layer.

[0436] Optionally, the precursor layer can deeply fuse and refine all channel information within the same group to generate a high-quality intra-group feature representation, and can perform dimensionality reduction on the generated features to reduce the computational load of the first channel shuffling layer.

[0437] Optionally, the first channel shuffling layer is interspersed in each group.

[0438] Optionally, the first channel shuffling layer can rearrange the channel order of the input feature map to promote information flow and fusion between different channels. First, the channel dimension of the input feature map is reshaped into two dimensions: the number of groups and the number of channels in each group. Next, these two dimensions are transposed, so that the number of groups and the number of channels in each group swap positions. Finally, the transposed dimensions are flattened to restore the original number of channels. This can improve the expressive power of the features without increasing the computational cost.
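The reshape, transpose, and flatten sequence described above can be sketched in a few lines of NumPy. The function name `channel_shuffle` and the (channels, H, W) layout are assumptions made for illustration; this particular permutation is only one possible channel rearrangement rule.

```python
import numpy as np

def channel_shuffle(x, num_groups):
    """Shuffle channels across groups; x has shape (C, H, W), C divisible by num_groups."""
    c, h, w = x.shape
    assert c % num_groups == 0
    x = x.reshape(num_groups, c // num_groups, h, w)  # split C into (groups, channels per group)
    x = x.transpose(1, 0, 2, 3)                       # swap the two channel dimensions
    return x.reshape(c, h, w)                         # flatten back to C channels

# Six channels labelled 0..5, split into two groups: [0, 1, 2] and [3, 4, 5].
x = np.arange(6).reshape(6, 1, 1)
shuffled = channel_shuffle(x, num_groups=2)
# Channel order after shuffling: 0, 3, 1, 4, 2, 5
```

After the shuffle, adjacent channels come from different groups, so a following grouped convolution sees features from every group. The operation only reorders data and involves no multiplications.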

[0439] Optionally, for any group, at least one input feature can be input into the precursor layer for convolution processing, and then the output of the precursor layer can be input into the first channel shuffling layer for cross-group permutation operation in the channel dimension. Then, the output of the first channel shuffling layer can be input into the subsequent convolution module for convolution processing until the neural network outputs the filtered target image block.

[0440] Optionally, in the first channel shuffling layer, the outputs of the precursor layers of the groups can be permuted across groups in the channel dimension according to the first channel rearrangement rule. For example, the feature corresponding to the first channel of the first group and the feature corresponding to the first channel of the second group can be permuted across groups.

[0441] Optionally, the output feature map of the precursor layer includes the output results of the precursor layer.

[0442] Optionally, the first channel rearrangement rule can be a pre-set rearrangement rule for the channel dimensions, and there are no restrictions on this.

[0443] In this embodiment, each group in the neural network is sequentially configured with a precursor layer containing at least one convolutional module, a first channel shuffling layer, and a subsequent convolutional module for each group, and / or during group convolution processing, the network sequentially passes through the precursor layer, the first channel shuffling layer, and the subsequent convolutional module for corresponding processing, and / or the first channel shuffling layer can perform cross-group permutation of the output feature map of the precursor layer in the channel dimension through the first channel rearrangement rule, which can improve the filtering effect of filtering at least one image block.

[0444] Method 27: The grouped convolutional module includes a second channel shuffling layer and a convolutional module for each group, and the input of the convolutional module for each group is the output of the second channel shuffling layer.

[0445] Method 28: The second channel shuffling layer is used to perform cross-group permutation of the input feature map, which includes at least two groups of features, in the channel dimension by means of the second channel rearrangement rule;

[0446] Optionally, in the neural network, at least one convolutional module and a second channel shuffling layer can be set in each group. The second channel shuffling layer can be placed before at least one convolutional module, and the second channel shuffling layer can be interspersed in each group. Optionally, the second channel shuffling layer can refer to the first channel shuffling layer described above, and will not be repeated here.

[0447] Optionally, for any group, at least one input feature of that group can be input into the second channel shuffling layer. Within the second channel shuffling layer, the input feature map, which includes at least one group of features, is permuted across groups along the channel dimension using the second channel rearrangement rule. Then, the output of the second channel shuffling layer is input into the subsequent convolutional module of each group for convolution processing until the neural network outputs the filtered target image block.

[0448] Optionally, each group corresponds to an input feature map, which includes input features input to at least one group.

[0449] Optionally, the second channel rearrangement rule can be the same as or different from the first channel rearrangement rule; no restrictions are placed here.

[0450] Optionally, the second channel rearrangement rule can be a pre-set rearrangement rule for the channel dimensions.

[0451] In this embodiment, the neural network sequentially includes a second channel shuffling layer and a convolutional module for each group. The second channel shuffling layer can perform cross-group permutation of the input feature map, which includes at least one group of features, in the channel dimension through the second channel rearrangement rule, which can improve the filtering effect of filtering at least one image block.

[0452] Method 29: The grouped convolutional module sequentially includes a second channel shuffling layer, a precursor layer for each group, a first channel shuffling layer, and a subsequent convolutional module for each group.

[0453] Optionally, in the neural network, a second channel shuffling layer, a precursor layer, a first channel shuffling layer, and a subsequent convolutional module can be set sequentially within each group.

[0454] Optionally, for any group, at least one input feature of that group can be input into the second channel shuffling layer. Within the second channel shuffling layer, the input feature map (including at least one input feature), which comprises at least one group of features, is permuted across groups in the channel dimension using the second channel rearrangement rule. The output of the second channel shuffling layer is then input into the precursor layer of each group for convolution processing. The output of the precursor layer (i.e., the output feature map) is input into the first channel shuffling layer, where it is permuted across groups in the channel dimension using the first channel rearrangement rule. Finally, the output of the first channel shuffling layer is input into the subsequent convolutional modules for convolution processing, until the neural network outputs the filtered target image block.
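The layer order of Method 29 can be sketched as a small pipeline. The sketch assumes 1x1 grouped convolutions, equal-sized groups, and the same reshape-transpose shuffle for both rearrangement rules; all function names and tensor sizes are hypothetical, and a real network would use trained weights and larger kernels.

```python
import numpy as np

def channel_shuffle(x, g):
    """Reshape-transpose-flatten shuffle over g groups; x has shape (C, H, W)."""
    c, h, w = x.shape
    return x.reshape(g, c // g, h, w).transpose(1, 0, 2, 3).reshape(c, h, w)

def grouped_conv1x1(x, weights):
    """Independent 1x1 convolution per group; weights[i] has shape (C_out_i, C_in_i)."""
    chunks = np.split(x, len(weights), axis=0)
    outs = [np.einsum("oc,chw->ohw", w, ch) for w, ch in zip(weights, chunks)]
    return np.concatenate(outs, axis=0)

def method29_block(x, pre_weights, post_weights):
    g = len(pre_weights)
    x = channel_shuffle(x, g)                # second channel shuffling layer
    x = grouped_conv1x1(x, pre_weights)      # precursor layer of each group
    x = channel_shuffle(x, g)                # first channel shuffling layer
    return grouped_conv1x1(x, post_weights)  # subsequent convolutional module of each group

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))                     # 8 channels in 2 groups of 4
pre = [rng.standard_normal((4, 4)) for _ in range(2)]
post = [rng.standard_normal((4, 4)) for _ in range(2)]
y = method29_block(x, pre, post)                       # shape (8, 4, 4)
```

Dropping the first line of `method29_block` gives the order of Method 23, and dropping the precursor layer and the second shuffle gives the order of Method 27.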

[0455] In this embodiment, the neural network sequentially includes a second channel shuffling layer, a precursor layer for each group, a first channel shuffling layer, a subsequent convolutional module for each group, and / or can perform cross-group permutation operations multiple times in the channel dimension through the first channel shuffling layer and the second channel shuffling layer to improve the filtering effect of filtering at least one image block.

[0456] Fifth embodiment

[0457] Based on any of the above embodiments, a fifth embodiment is proposed.

[0458] In this embodiment, step S12 includes at least one of steps c1 to c5:

[0459] Step c1: Determine or obtain at least one first intermediate value based on the neural network and at least one input feature, and perform filtering processing on at least one image block based on the at least one first intermediate value and at least one lookup table;

[0460] Optionally, the first intermediate value can be data generated during an intermediate processing step in the filtering process, such as a filtered pixel generated after at least one filtering process, or other values. Optionally, the intermediate processing step can be a process of performing pre-filtering processing on the pixel to be filtered (such as filtering through a neural network or lookup table).

[0461] Optionally, at least one input feature can be input into the neural network for filtering processing, and the output can be a first intermediate value. For example, at least one of the following can be input into the neural network for filtering processing: quantization parameters, boundary strength, representation interval, position information of the pixel to be filtered, pixel value to be filtered, size information of image block, filtering information of neighboring blocks, filtering information of non-neighboring blocks, filtering information of cross-component blocks, filtering information of co-position blocks, filtering information of temporal blocks, filtering information of default blocks, and filtering information of candidate blocks, and the output can be a first intermediate value.

[0462] Optionally, the first intermediate value can be directly used as the filtered pixel, and the target image block can be determined or generated based on the filtered pixel.

[0463] Optionally, the first intermediate value can be processed to determine or generate a target image block. For example, the first intermediate value can be input into at least one lookup table to determine the filtered pixel, and the target image block can be determined or generated based on the at least one filtered pixel. Optionally, the lookup table may include the correspondence between the first intermediate value and the filtered pixel.

[0464] Optionally, at least one lookup table can be a multi-level lookup table. The first intermediate value can be converted into an index and input into the multi-level lookup table for parallel or serial lookup; the output is the filtered pixel, and the target image block is determined or generated based on at least one filtered pixel.

[0465] In this embodiment, by determining or generating at least one first intermediate value based on a neural network and at least one input feature, and by determining or generating a target image block based on the at least one first intermediate value and at least one lookup table, the filtering effect can be improved by using the neural network and the lookup table together for filtering processing, thereby improving the effect of video encoding and / or decoding.
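Step c1 can be sketched as a two-stage pipeline: a network produces the first intermediate values, which are converted into indices and looked up in a table. In the sketch below, `toy_network` is a hypothetical stand-in (a fixed horizontal average) for the trained neural network, and the rounding rule used to form the index and the identity table are assumptions chosen only to keep the example checkable.

```python
import numpy as np

def toy_network(block):
    """Hypothetical stand-in for the neural network: average each pixel with its
    left and right neighbours (edge-padded)."""
    padded = np.pad(block.astype(np.float32), ((0, 0), (1, 1)), mode="edge")
    return (padded[:, :-2] + padded[:, 1:-1] + padded[:, 2:]) / 3.0

def filter_c1(block, lut):
    intermediate = toy_network(block)  # first intermediate values
    # Assumed index rule: round to the nearest integer and clamp to the table range.
    index = np.clip(np.rint(intermediate), 0, len(lut) - 1).astype(np.int64)
    return lut[index]                  # filtered pixels from the lookup table

lut = np.arange(256, dtype=np.uint8)   # identity table, for illustration only
block = np.array([[10, 20, 30], [40, 40, 40]], dtype=np.uint8)
target = filter_c1(block, lut)         # [[13, 20, 27], [40, 40, 40]]
```

In practice the table would encode a learned or precomputed mapping from intermediate values to filtered pixels rather than the identity.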

[0466] Step c2: Determine or obtain at least one second intermediate value based on at least one input feature and at least one lookup table, and perform filtering processing on at least one image block based on the neural network and at least one second intermediate value;

[0467] Optionally, the second intermediate value can be data generated during an intermediate processing step in the filtering process, such as filtered pixels generated after at least one filtering process, or other values. Optionally, the second intermediate value can be the same as or different from the first intermediate value.

[0468] Optionally, at least one lookup table can be selected from multiple lookup tables using at least one input feature, and the at least one input feature can be entered into the at least one lookup table to find and obtain at least one second intermediate value. Optionally, the lookup table includes a correspondence between at least one input feature and a second intermediate value, and / or an index can be set for each correspondence in the lookup table.

[0469] Optionally, at least one lookup table can be a multi-level lookup table. The pixel values to be filtered in at least one image block can be converted into an index and input into the multi-level lookup table for parallel or serial lookup, outputting at least one second intermediate value. Optionally, the index of the pixel values to be filtered in at least one image block can be updated (e.g., the index value is increased or decreased) based on at least one of the following: quantization parameters, boundary strength, characterization parameters, position information of the pixel to be filtered, size information of the image block, filtering information of neighboring blocks, filtering information of non-neighboring blocks, filtering information of cross-component blocks, filtering information of co-position blocks, filtering information of temporal blocks, filtering information of default blocks, and filtering information of candidate blocks. The updated index is then input into the multi-level lookup table for parallel or serial lookup, outputting at least one second intermediate value.

[0470] Optionally, a lookup table can be selected based on at least one of the following: quantization parameters, boundary strength, characterization interval, position information of the pixel to be filtered, pixel value to be filtered, size information of the image block, filtering information of neighboring blocks, filtering information of non-neighboring blocks, filtering information of cross-component blocks, filtering information of co-position blocks, filtering information of temporal blocks, filtering information of default blocks, and filtering information of candidate blocks. The pixel value to be filtered can be converted into an index and input into the selected lookup table for searching to determine the filtered pixel, which is then output as the second intermediate value.

[0471] Optionally, at least one second intermediate value can be input into the neural network to obtain filtered pixels as output, and the target image block can be determined or generated based on the filtered pixels.

[0472] Optionally, the neural network may be determined based on at least one input feature.

[0473] Alternatively, an image block including at least one second intermediate value can be input into a neural network for filtering, and the output can be a target image block including the filtered pixels.

[0474] In this embodiment, at least one second intermediate value is obtained by searching at least one lookup table based on at least one input feature, and a target image block is determined or generated based on the neural network and at least one second intermediate value. The filtering effect can be improved by using the neural network and the lookup table together, thereby improving the effect of video encoding and / or decoding.
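Step c2 reverses the order of step c1: the lookup table runs first and the neural network second. The sketch below also illustrates an index update driven by the quantization parameter. The update rule in `update_index` (add an offset when the quantization parameter exceeds a threshold), the table contents, and the affine `toy_network` are all hypothetical and serve only to show the data flow.

```python
import numpy as np

def update_index(index, qp, qp_threshold=32, offset=1, table_size=256):
    """Hypothetical update rule: increase the index when qp exceeds a threshold."""
    if qp > qp_threshold:
        index = index + offset
    return np.clip(index, 0, table_size - 1)

def toy_network(values):
    """Hypothetical stand-in for the neural network: a fixed affine map."""
    return 0.5 * values + 1.0

def filter_c2(block, lut, qp):
    index = update_index(block.astype(np.int64), qp, table_size=len(lut))
    second_intermediate = lut[index]         # lookup-table stage
    return toy_network(second_intermediate)  # neural-network stage

lut = np.arange(256, dtype=np.float32) * 2.0
block = np.array([[0, 10], [254, 255]], dtype=np.uint8)
out_high_qp = filter_c2(block, lut, qp=40)  # index raised by 1, then clamped
out_low_qp = filter_c2(block, lut, qp=20)   # index used unchanged
```

The clamp keeps an updated index inside the table, which matters for pixel values at the top of the range.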

[0475] Step c3: Determine or generate at least one index based on at least one input feature, and perform filtering on at least one image block based on at least one index and at least one lookup table;

[0476] Optionally, at least one index can be determined by at least one input feature in at least one image block, and the index can be determined by referring to at least one of methods 31 to 37 in the following embodiments.

[0477] Optionally, the index can be a number, an array, an identifier, a label, etc.

[0478] Optionally, it can be determined whether the index of the pixel value to be filtered of at least one image block needs to be updated based on at least one of the following: quantization parameters, boundary strength, characterization parameters, position information of the pixel to be filtered, size information of the image block, filtering information of neighboring blocks, filtering information of non-neighboring blocks, filtering information of cross-component blocks, filtering information of co-position blocks, filtering information of temporal blocks, filtering information of default blocks, and filtering information of candidate blocks. If necessary, the converted index can be increased or decreased to obtain the updated index.

[0479] Optionally, after determining at least one index, the at least one index can be entered into at least one lookup table for searching to determine the corresponding filtered pixel and output it, and the target image block can be determined or generated based on the output at least one filtered pixel.

[0480] Optionally, after determining or generating at least one index, such as a first index, based on at least one input feature, the first index is input into the first level of the multi-level lookup table for searching to obtain a first search result. Then, a second index is determined or obtained based on the first search result, and the second index is input into the second level of the multi-level lookup table for searching, until the last level of the lookup table outputs the filtered pixel or the target image block containing the filtered pixel. Optionally, the at least one lookup table may include a multi-level lookup table.
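The serial multi-level lookup described above can be sketched as a loop in which each level's search result yields the index for the next level. The function `multilevel_lookup`, the `to_index` callback that converts a search result into the next index, and the table contents are hypothetical.

```python
def multilevel_lookup(first_index, levels, to_index):
    """Traverse the lookup tables in series; the last level returns the filtered pixel."""
    index = first_index
    result = None
    for level, table in enumerate(levels):
        result = table[index]
        if level < len(levels) - 1:
            index = to_index(result)  # derive the next index from this search result
    return result

levels = [
    [3, 2, 1, 0],          # level 1: first search results
    [100, 110, 120, 130],  # level 2: filtered pixel values
]
# Assumption for this sketch: the search result is used directly as the next index.
filtered = multilevel_lookup(1, levels, to_index=lambda r: r)  # 1 -> 2 -> 120
```

Splitting one large table into several smaller levels can reduce storage at the cost of one additional lookup per level.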

[0481] In this embodiment, the target image block is determined or generated based on at least one index determined or generated according to at least one input feature, and at least one lookup table. By using a lookup table for filtering, the complexity of the filtering process can be reduced, thereby improving the efficiency of video encoding and / or decoding.

[0482] Step c4: Determine or obtain at least one third intermediate value based on the neural network and at least one input feature; determine or obtain at least one fourth intermediate value based on the at least one third intermediate value and at least one lookup table; and perform filtering processing on at least one image block based on the neural network and at least one fourth intermediate value.

[0483] Optionally, the third and / or fourth intermediate values can be data generated during an intermediate processing step in the filtering process, can be filtered pixels generated after at least one filtering process, or can be other values.

[0484] Optionally, the fourth intermediate value, and / or the third intermediate value, and / or the second intermediate value, and / or the first intermediate value may be the same or different.

[0485] Optionally, the neural network can be determined or selected based on at least one input feature (such as reconstruction transformation features of different components, and / or prediction transformation features, and / or derived transformation features), and the at least one input feature can be input into the neural network for filtering processing, outputting a third intermediate value. Optionally, at least one image block and its input features can also be input into the neural network for filtering processing, outputting a once-filtered image block, whose pixels can be used as the third intermediate value.

[0486] Optionally, at least one third intermediate value can be input into at least one lookup table to determine and generate at least one fourth intermediate value. Optionally, at least one third intermediate value can be converted into an index (e.g., directly using the third intermediate value as the index, or transforming the third intermediate value to obtain the index), and then the index can be input into at least one lookup table to find and output at least one fourth intermediate value. Optionally, the lookup table includes the correspondence between the third and fourth intermediate values, and / or indexes can be set in the lookup table, with each correspondence corresponding to one index.

[0487] Optionally, at least one lookup table can be a multi-level lookup table. At least one third intermediate value can be input into the multi-level lookup table for serial or parallel lookup, and at least one fourth intermediate value can be output.

[0488] Optionally, at least one fourth intermediate value can be input into the neural network for model training, and the output can be a target image patch including at least one filtered pixel.

[0489] In this embodiment, at least one input feature is filtered by an architecture that applies a neural network, then a lookup table, and then a neural network, to determine or generate the target image patch. Using the neural network and the lookup table together can improve the filtering effect, thereby improving the video encoding and / or decoding effect.
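The data flow of step c4 (neural network, then lookup table, then neural network) can be sketched as follows. This is only an illustration under stated assumptions: the two "networks" are trivial stand-in functions rather than trained models, and the names `filter_nn_lut_nn` and `value_to_index`, the table contents, and the truncate-and-clamp index rule are all hypothetical.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def value_to_index(value: float, size: int) -> int:
    """Assumed rule: truncate an intermediate value and clamp it to a valid index."""
    return min(max(int(value), 0), size - 1)


def filter_nn_lut_nn(
    input_features: Vector,
    first_net: Callable[[Vector], Vector],
    lookup_table: Sequence[float],
    second_net: Callable[[Vector], Vector],
) -> Vector:
    # Neural network + input features -> third intermediate values.
    third = first_net(input_features)
    # Third intermediate values -> indexes -> fourth intermediate values.
    fourth = [lookup_table[value_to_index(v, len(lookup_table))] for v in third]
    # Neural network + fourth intermediate values -> filtered pixels.
    return second_net(fourth)


# Stand-in "networks" (hypothetical, for illustration only).
def first_net(xs: Vector) -> Vector:
    return [0.5 * x for x in xs]


def second_net(xs: Vector) -> Vector:
    return [x + 1.0 for x in xs]


lut = [0.0, 1.0, 4.0, 9.0]
filtered = filter_nn_lut_nn([2.0, 4.0, 6.0], first_net, lut, second_net)
print(filtered)
```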

[0490] Step c5: Determine or obtain at least one fifth intermediate value based on at least one input feature and at least one lookup table; determine or obtain at least one sixth intermediate value based on at least one fifth intermediate value and a neural network; and perform filtering processing on at least one image block based on at least one sixth intermediate value and at least one lookup table.

[0491] Optionally, the fifth intermediate value and / or the sixth intermediate value can be data generated during an intermediate processing step in the filtering process, which can be filtered pixels generated after at least one filtering process, or other values. Optionally, the sixth intermediate value, and / or the fifth intermediate value, and / or the fourth intermediate value, and / or the third intermediate value, and / or the second intermediate value, and / or the first intermediate value can be the same or different.

[0492] Optionally, at least one lookup table can be selected from multiple lookup tables based on at least one input feature (such as reconstruction transformation features of different components, and / or prediction transformation features, and / or derived transformation features), and the at least one input feature can be input into the at least one lookup table to search and obtain at least one fifth intermediate value. Optionally, an index can also be determined based on at least one input feature, and the index can be input into the at least one lookup table to search and determine at least one fifth intermediate value.

[0493] Optionally, the lookup table includes a correspondence between at least one input feature and a fifth intermediate value, and / or an index may be set for each correspondence in the lookup table.

[0494] Optionally, at least one lookup table can be a multi-level lookup table. The first input feature can be converted into an index input and fed into the multi-level lookup table for parallel or serial lookup, outputting at least one fifth intermediate value.

[0495] Optionally, at least one fifth intermediate value can be input into the neural network for filtering, and the output can be at least one sixth intermediate value. Alternatively, an image patch including at least one fifth intermediate value can be input into the neural network for filtering, and the output can be an image patch including at least one sixth intermediate value.

[0496] Optionally, at least one sixth intermediate value can be input into at least one lookup table to determine the filtered pixel, and the target image block can be determined or generated based on the filtered pixel. Alternatively, at least one sixth intermediate value can be used to determine an index, which can then be input into at least one lookup table to determine the filtered pixel.

[0497] Optionally, the lookup table may also include at least a sixth intermediate value and a correspondence with the filtered pixel, and / or an index may be set for each correspondence in the lookup table.

[0498] Optionally, at least one sixth intermediate value can be input into a multi-level lookup table for parallel or serial lookup, and the output can be either the filtered pixel or the target image block containing the filtered pixel.

[0499] In this embodiment, at least one input feature is filtered by an architecture that applies a lookup table, then a neural network, and then a lookup table, to determine or generate the target image patch. Using the neural network and the lookup table together can improve the filtering effect, thereby improving the video encoding and / or decoding effect.
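Step c5 reverses the order (lookup table, then neural network, then lookup table) and allows the first lookup table to be selected from several tables according to the input feature. A minimal sketch follows, assuming that the selection key is a component label such as "luma" or "chroma" and that a simple stand-in function plays the role of the neural network. The names `filter_lut_nn_lut` and `to_index` and all table contents are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def to_index(value: float, size: int) -> int:
    """Assumed rule: truncate a value and clamp it to a valid table index."""
    return min(max(int(value), 0), size - 1)


def filter_lut_nn_lut(
    input_features: Sequence[float],
    component: str,
    first_tables: Dict[str, Sequence[float]],
    net: Callable[[List[float]], List[float]],
    last_table: Sequence[float],
) -> List[float]:
    # Select one lookup table from several, based on the kind of input feature.
    first_table = first_tables[component]
    # Input features -> fifth intermediate values.
    fifth = [first_table[to_index(f, len(first_table))] for f in input_features]
    # Fifth intermediate values -> neural network -> sixth intermediate values.
    sixth = net(fifth)
    # Sixth intermediate values -> filtered pixels.
    return [last_table[to_index(s, len(last_table))] for s in sixth]


def halve(xs: List[float]) -> List[float]:
    """Stand-in for the neural network (hypothetical)."""
    return [x / 2 for x in xs]


first_tables = {"luma": [0, 2, 4, 6], "chroma": [1, 1, 2, 2]}
last_table = [100, 110, 120, 130]
luma_out = filter_lut_nn_lut([1, 3], "luma", first_tables, halve, last_table)
chroma_out = filter_lut_nn_lut([1, 3], "chroma", first_tables, halve, last_table)
print(luma_out, chroma_out)
```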

[0500] Sixth Embodiment

[0501] This application also provides a processing device. Referring to Figure 10, the processing device includes:

[0502] Processing module A10 is used to perform filtering processing on at least one image patch based on multiple input branches, a neural network, and / or a lookup table.

[0503] Optionally, the image patch features include at least one of the following: the reconstruction transform features of the reconstructed block of at least one image patch, the prediction transform features of the predicted block of at least one image patch, and the derivative transform features of the derived block of at least one image patch.

[0504] Optionally, the processing module A10 is used for:

[0505] The input features are determined or obtained by channel stitching based on multiple input branches and features of at least one image patch.

[0506] Filter at least one input feature based on a neural network and / or a lookup table.

[0507] Optionally, multiple input branches include at least one of the following:

[0508] The first input branch is used to process the luminance component;

[0509] The second input branch is used to process the chromaticity components;

[0510] The third input branch is used for mixing the luminance and chrominance components.

[0511] Optionally, the first input branch includes at least one of the following: a first sub-input branch for processing the reconstructed transform features and derived transform features on the luminance component; a second sub-input branch for processing the predicted transform features and derived transform features on the luminance component; a third sub-input branch for processing the reconstructed transform features on the luminance component; a fourth sub-input branch for processing the predicted transform features on the luminance component; and a fifth sub-input branch for processing the derived transform features on the luminance component; and / or,

[0512] The second input branch includes at least one of the following: a sixth sub-input branch for processing the reconstructed transform features and derived transform features on the chroma components; a seventh sub-input branch for processing the predicted transform features and derived transform features on the chroma components; an eighth sub-input branch for processing the reconstructed transform features on the chroma components; a ninth sub-input branch for processing the predicted transform features on the chroma components; a tenth sub-input branch for processing the derived transform features on the chroma components; a fourth input branch for processing the U component; and a fifth input branch for processing the V component.

[0513] Optionally, the processing module A10 is also configured to perform at least one of the following:

[0514] Based on the first input branch, channel splicing is performed on the predicted transformation features of at least one luminance component, the reconstructed transformation features of at least one luminance component, and the derived transformation features of at least one luminance component.

[0515] Based on the second input branch, channel splicing is performed on the predicted transformation features of at least one chromaticity component, the reconstructed transformation features of at least one chromaticity component, and the derived transformation features of at least one luminance component.

[0516] Based on the third input branch, at least one of the following is channel-stitched: the predicted transformation feature of at least one luminance component and / or chrominance component, the derived transformation feature of at least one luminance component and / or chrominance component, and the reconstructed transformation feature of at least one luminance component and / or chrominance component.

[0517] Channel splicing is performed based on the reconstructed transformation features of at least one luminance component and the derived transformation features of at least one luminance component from the first sub-input branch.

[0518] Channel splicing is performed based on the predicted transform features of at least one luminance component and the derived transform features of at least one luminance component from the second sub-input branch.

[0519] Channel splicing is performed based on the reconstruction transformation features of at least one luminance component according to the third sub-input branch;

[0520] Channel splicing is performed based on the predicted transform features of at least one luminance component according to the fourth sub-input branch;

[0521] Channel splicing is performed based on the derived transform features of at least one luminance component according to the fifth sub-input branch;

[0522] Channel splicing is performed based on the reconstruction transformation features of at least one chromaticity component and the derived transformation features of at least one chromaticity component from the sixth sub-input branch.

[0523] Channel splicing is performed based on the predicted transform features of at least one chromaticity component and the derived transform features of at least one chromaticity component from the seventh sub-input branch.

[0524] Channel splicing is performed based on the reconstruction transformation features of at least one chromaticity component according to the eighth sub-input branch;

[0525] Channel splicing is performed based on the predicted transform characteristics of at least one chromaticity component according to the ninth sub-input branch;

[0526] Channel splicing is performed based on the derived transform features of at least one chromaticity component according to the tenth sub-input branch;

[0527] Based on the fourth input branch, channel splicing is performed on the predicted transformation features of at least one U component, the reconstructed transformation features of at least one U component, and the derived transformation features of at least one U component.

[0528] Based on the fifth input branch, channel splicing is performed on the predicted transformation features of at least one V component, the reconstructed transformation features of at least one V component, and the derived transformation features of at least one V component.
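Channel splicing as listed above amounts to concatenating feature maps along the channel dimension. The following sketch shows this for the first input branch, assuming that each feature map is a list of channels of equal spatial size. The name `channel_splice`, the channel counts, and the sample values are hypothetical.

```python
from typing import List, Optional, Tuple

Plane = List[List[float]]     # one channel: H rows of W samples
FeatureMap = List[Plane]      # C channels


def channel_splice(*feature_maps: FeatureMap) -> FeatureMap:
    """Concatenate feature maps along the channel dimension, keeping their order."""
    spliced: FeatureMap = []
    expected: Optional[Tuple[int, int]] = None
    for feature_map in feature_maps:
        for plane in feature_map:
            shape = (len(plane), len(plane[0]) if plane else 0)
            if expected is None:
                expected = shape
            elif shape != expected:
                raise ValueError(f"spatial size mismatch: {shape} vs {expected}")
            spliced.append(plane)
    return spliced


# First input branch (luminance): predicted, reconstructed and derived features.
pred_y = [[[1.0, 2.0], [3.0, 4.0]]]                                 # 1 channel
rec_y = [[[5.0, 6.0], [7.0, 8.0]]]                                  # 1 channel
derived_y = [[[0.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.0, 0.0]]]    # 2 channels
first_branch_input = channel_splice(pred_y, rec_y, derived_y)
print(len(first_branch_input))
```

The other branches and sub-input branches would differ only in which feature maps are passed in.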

[0529] Optionally, the neural network includes a grouped convolutional module for dividing at least one input feature into at least one group for convolutional processing;

[0530] Optionally, at least one group includes at least one of the following:

[0531] At least one set of features corresponding to the first input branch in at least one input feature;

[0532] At least one set of features corresponding to the second input branch in at least one input feature;

[0533] At least one set of features corresponding to the third input branch in at least one input feature;

[0534] At least one set of features used to process the features corresponding to the first sub-input branch of at least one input feature;

[0535] At least one set of features used to process the features corresponding to the second sub-input branch of at least one input feature;

[0536] At least one set of features used to process the features corresponding to the third sub-input branch of at least one input feature;

[0537] At least one set of features used to process the features corresponding to the fourth sub-input branch of at least one input feature;

[0538] At least one set of features corresponding to the fifth sub-input branch in at least one input feature;

[0539] At least one set of features corresponding to the sixth sub-input branch in at least one input feature;

[0540] At least one set of features used to process the features corresponding to the seventh sub-input branch of at least one input feature;

[0541] At least one set of features used to process the features corresponding to the eighth sub-input branch of at least one input feature;

[0542] At least one set of features corresponding to the ninth sub-input branch in at least one input feature;

[0543] At least one set of features used to process the features corresponding to the tenth sub-input branch of at least one input feature;

[0544] At least one set of features used to process the features corresponding to the fourth input branch of at least one input feature;

[0545] At least one set of features used to process the features corresponding to the fifth input branch of at least one input feature;

[0546] At least one set of at least one channel from a plurality of channels for each input branch.

[0547] Optionally, the input branch is at least one of the multiple input branches corresponding to at least one input feature.
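To illustrate how dividing the input features into groups can reduce the number of convolution operations, the sketch below applies an independent 1x1 (pointwise) convolution to each group of channels and compares the weight count with that of a single ungrouped 1x1 convolution. The 1x1 kernel, the group sizes, the weights, and the name `grouped_pointwise_conv` are assumptions for illustration. This application does not fix the kernel size or the grouping.

```python
from typing import List, Sequence

Channel = List[float]   # one channel flattened to a list of samples


def grouped_pointwise_conv(
    channels: Sequence[Channel],
    group_sizes: Sequence[int],
    group_weights: Sequence[Sequence[Sequence[float]]],
) -> List[Channel]:
    """Apply an independent 1x1 convolution to each group of input channels.

    group_weights[g][o][i] is the weight from input channel i of group g to
    output channel o of that group.
    """
    if len(group_sizes) != len(group_weights):
        raise ValueError("one weight matrix is required per group")
    if sum(group_sizes) != len(channels):
        raise ValueError("group sizes must cover all input channels")
    outputs: List[Channel] = []
    start = 0
    for size, weights in zip(group_sizes, group_weights):
        group = channels[start:start + size]
        start += size
        for row in weights:                      # one output channel per row
            if len(row) != size:
                raise ValueError("weight row length must equal group size")
            outputs.append([
                sum(w * ch[p] for w, ch in zip(row, group))
                for p in range(len(group[0]))
            ])
    return outputs


channels = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
group_sizes = [2, 2]                             # e.g. one group per input branch
group_weights = [[[1.0, 1.0]], [[1.0, -1.0]]]    # one output channel per group
out = grouped_pointwise_conv(channels, group_sizes, group_weights)

grouped_weight_count = sum(len(w) * len(w[0]) for w in group_weights)
dense_weight_count = len(out) * len(channels)    # ungrouped 1x1 conv, same shape
print(out, grouped_weight_count, dense_weight_count)
```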

[0548] Optionally, the processing module A10 is also configured to perform at least one of the following:

[0549] The grouped convolutional module includes at least one convolutional module corresponding to a group, and the convolutional modules corresponding to at least one group are independent of each other;

[0550] At least two groups contain the same number of input channels;

[0551] At least two groups have the same convolutional layer parameters;

[0552] The grouped convolutional module includes, in sequence, a predecessor layer, a first channel shuffling layer, and a subsequent convolutional module for each group;

[0553] The input to the first channel shuffling layer is the output of at least two groups of precursor layers, and the output of the first channel shuffling layer is the input of at least two groups of subsequent convolutional modules;

[0554] The precursor layer includes at least one convolutional module for convolutional processing of at least one set of features;

[0555] The first channel shuffling layer is used to perform cross-group permutation of the output feature map of the predecessor layer in the channel dimension according to the first channel rearrangement rule;

[0556] The grouped convolutional module includes a second channel shuffling layer and a convolutional module for each group, and the input of the convolutional module for each group is the output of the second channel shuffling layer;

[0557] The second channel shuffling layer is used to perform cross-group permutation of the input feature map, which includes at least two groups of features, in the channel dimension according to the second channel rearrangement rule;

[0558] The grouped convolutional module sequentially includes a second channel shuffling layer, a precursor layer for each group, a first channel shuffling layer, and a subsequent convolutional module for each group.
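The channel rearrangement rules themselves are not spelled out here. One common choice for a cross-group permutation in the channel dimension is the interleaving shuffle used in ShuffleNet-style networks, which is assumed in the sketch below. The name `channel_shuffle` and the channel labels are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def channel_shuffle(channels: Sequence[T], groups: int) -> List[T]:
    """Interleave channels across groups.

    Equivalent to viewing the channel list as a (groups, per_group) grid,
    transposing it, and flattening it again.
    """
    if groups <= 0 or len(channels) % groups != 0:
        raise ValueError("channel count must be a positive multiple of groups")
    per_group = len(channels) // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


# Two groups (a, b) with two channels each, e.g. outputs of two precursor layers.
shuffled = channel_shuffle(["a0", "a1", "b0", "b1"], groups=2)
print(shuffled)
```

After the shuffle, each subsequent per-group convolutional module receives channels that originated in different groups, which is how information is exchanged between otherwise independent groups.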

[0559] Optionally, the processing module A10 is also configured to perform at least one of the following:

[0560] At least one first intermediate value is determined or obtained based on a neural network and at least one input feature, and at least one image block is filtered based on the at least one first intermediate value and at least one lookup table;

[0561] At least one second intermediate value is determined or obtained based on at least one input feature and at least one lookup table, and at least one image patch is filtered based on a neural network and at least one second intermediate value;

[0562] Determine or generate at least one index based on at least one input feature, and perform filtering on at least one image patch based on at least one index and at least one lookup table;

[0563] At least one third intermediate value is determined or obtained based on a neural network and at least one input feature; at least one fourth intermediate value is determined or obtained based on at least one third intermediate value and at least one lookup table; and at least one image patch is filtered based on a neural network and at least one fourth intermediate value.

[0564] At least one fifth intermediate value is determined or obtained based on at least one input feature and at least one lookup table, at least one sixth intermediate value is determined or obtained based on at least one fifth intermediate value and a neural network, and at least one image patch is filtered based on at least one sixth intermediate value and at least one lookup table.

[0565] Optionally, the derived block is determined or obtained by at least one of the following:

[0566] Cropping results obtained by cropping the reconstructed block and / or predicted block of at least one image patch;

[0567] The filling result of filling the reconstructed block and / or predicted block of at least one image patch;

[0568] The update result of pixel updates for at least one image patch's reconstructed block and / or predicted block;

[0569] The translation result of pixel translation of at least one image patch's reconstructed block and / or predicted block;

[0570] The result of pixel-by-pixel editing of at least one image patch's reconstructed block and / or predicted block.
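Three of the operations listed above (cropping, filling, and pixel translation) can be sketched on a small block stored as a list of rows. The function names, the constant fill value, and the zero fill for samples shifted in from outside the block are assumptions for illustration. The update and editing operations are not shown because their rules are not specified here.

```python
from typing import List

Block = List[List[int]]


def crop(block: Block, top: int, left: int, height: int, width: int) -> Block:
    """Return the height x width region whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in block[top:top + height]]


def pad(block: Block, margin: int, value: int = 0) -> Block:
    """Surround the block with `margin` samples of a constant fill value."""
    width = len(block[0]) + 2 * margin
    top = [[value] * width for _ in range(margin)]
    middle = [[value] * margin + list(row) + [value] * margin for row in block]
    bottom = [[value] * width for _ in range(margin)]
    return top + middle + bottom


def translate(block: Block, dy: int, dx: int, fill: int = 0) -> Block:
    """Shift samples by (dy, dx); positions left uncovered take the fill value."""
    height, width = len(block), len(block[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = block[y][x]
    return out


reconstructed = [[1, 2], [3, 4]]
cropped = crop(reconstructed, 0, 1, 2, 1)
padded = pad(reconstructed, 1)
shifted = translate(reconstructed, 0, 1)
print(cropped, padded, shifted)
```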

[0571] The processing device provided in this application embodiment is similar in implementation principle and beneficial effect to the technical solution shown in the corresponding method embodiment above, and will not be described again here.

[0572] This application also provides a processing device, including a memory and a processor. The memory stores an image processing program, and when the image processing program is executed by the processor, it implements the steps of the image processing method in any of the above embodiments.

[0573] This application also provides a storage medium storing an image processing program, which, when executed by a processor, implements the steps of the image processing method in any of the above embodiments.

[0574] In the embodiments of the processing device and storage medium provided in this application, all the technical features of any of the above-described image processing method embodiments may be included. The extended and explanatory content of the specification is basically the same as that of the embodiments of the above methods, and will not be repeated here.

[0575] This application also provides a computer program product, which includes computer program code. When the computer program code is run on a computer, it causes the computer to perform the methods described in the various possible implementations above.

[0576] This application also provides a chip, including a memory and a processor. The memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that a device with the chip installed performs the methods described in the various possible implementations above.

[0577] It is understood that the above scenarios are merely examples and do not constitute a limitation on the application scenarios of the technical solutions provided in the embodiments of this application. The technical solutions of this application can also be applied to other scenarios. For example, as those skilled in the art will know, with the evolution of system architecture and the emergence of new business scenarios, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.

[0578] The sequence numbers of the embodiments in this application are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0579] The steps in the method of this application embodiment can be adjusted, combined, or deleted according to actual needs.

[0580] The units in the device of this application embodiment can be merged, divided, and deleted according to actual needs.

[0581] In this application, identical or similar terms, concepts, technical solutions, and / or application scenario descriptions are generally described in detail only the first time they appear. Subsequent repetitions are generally omitted for brevity. When understanding the technical solutions of this application, for identical or similar terms, concepts, technical solutions, and / or application scenario descriptions not described in detail later, reference can be made to their preceding detailed descriptions. In this application, the descriptions of each embodiment have their own emphasis; parts not detailed or recorded in a certain embodiment can be referred to in the relevant descriptions of other embodiments. The technical features of the technical solutions of this application can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as the combinations of these technical features do not contradict each other, they should be considered within the scope of this application.

[0582] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) as described above, and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, controlled terminal, or network device, etc.) to execute the methods of each embodiment of this application.

[0583] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented, in whole or in part, as a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the flow or function according to the embodiments of this application is generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a storage medium or transmitted from one storage medium to another. For example, computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line) or wireless (e.g., infrared, radio, microwave, etc.) means. The storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, storage disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid-state disk (SSD)).

[0584] The above are merely preferred embodiments of this application and do not limit the patent scope of this application. Any equivalent structural or procedural transformations made using the content of this application's specification and drawings, or direct or indirect applications in other related technical fields, are similarly included within the patent protection scope of this application.

Claims

1. An image processing method, characterized in that, Including the following steps: S1, filtering at least one image patch based on multiple input branches, a neural network, and / or a lookup table; Step S1 includes the following steps: S11, Channel stitching is performed based on multiple input branches and at least one image patch feature to determine or obtain input features. The image patch feature includes derivative transformation features of derivative blocks of at least one image patch. The derivative block includes overlapping region information of the image patch. The derivative transformation feature includes pixel value and / or pixel position of the derivative block. S12, filter at least one input feature based on a neural network and / or lookup table; The multi-input branch includes at least one of the following: a first input branch for processing the luminance component; a second input branch for processing the chrominance component; and a third input branch for mixing and processing the luminance and chrominance components. The derived block is determined or obtained by at least one of the following: a cropping result of cropping the reconstructed block and / or predicted block of at least one image block; a filling result of filling the reconstructed block and / or predicted block of at least one image block; an update result of pixel updating the reconstructed block and / or predicted block of at least one image block; a translation result of pixel translation the reconstructed block and / or predicted block of at least one image block; or an editing result of pixel editing the reconstructed block and / or predicted block of at least one image block. 
The channel stitching based on multiple input branches and at least one image patch feature includes at least one of the following: Based on the first input branch, channel splicing is performed on the predicted transformation features of at least one luminance component, the reconstructed transformation features of at least one luminance component, and the derived transformation features of at least one luminance component. Based on the second input branch, channel splicing is performed on the predicted transformation features of at least one chromaticity component, the reconstructed transformation features of at least one chromaticity component, and the derived transformation features of at least one luminance component. Based on the third input branch, at least one of the following is channel-stitched: the predicted transformation feature of at least one luminance component and / or chrominance component, the derived transformation feature of at least one luminance component and / or chrominance component, and the reconstructed transformation feature of at least one luminance component and / or chrominance component.

2. The image processing method as described in claim 1, characterized in that, The image patch features also include at least one of the following: the reconstruction transformation features of the reconstructed block of at least one image patch, and the prediction transformation features of the prediction block of at least one image patch.

3. The image processing method as described in claim 1, characterized in that, The first input branch includes at least one of the following: a first sub-input branch for processing the reconstructed transform features and derived transform features on the luminance component; a second sub-input branch for processing the predicted transform features and derived transform features on the luminance component; a third sub-input branch for processing the reconstructed transform features on the luminance component; a fourth sub-input branch for processing the predicted transform features on the luminance component; and a fifth sub-input branch for processing the derived transform features on the luminance component. And / or, The second input branch includes at least one of the following: a sixth sub-input branch for processing the reconstructed transform features and derived transform features on the chroma components; a seventh sub-input branch for processing the predicted transform features and derived transform features on the chroma components; an eighth sub-input branch for processing the reconstructed transform features on the chroma components; a ninth sub-input branch for processing the predicted transform features on the chroma components; a tenth sub-input branch for processing the derived transform features on the chroma components; a fourth input branch for processing the U component; and a fifth input branch for processing the V component.

4. The image processing method as described in claim 3, characterized in that, Channel stitching based on multiple input branches and features of at least one image patch also includes at least one of the following: Channel splicing is performed based on the reconstructed transformation features of at least one luminance component and the derived transformation features of at least one luminance component from the first sub-input branch. Channel splicing is performed based on the predicted transform features of at least one luminance component and the derived transform features of at least one luminance component from the second sub-input branch. Channel splicing is performed based on the reconstruction transformation features of at least one luminance component according to the third sub-input branch; Channel splicing is performed based on the predicted transform features of at least one luminance component according to the fourth sub-input branch; Channel splicing is performed based on the derived transform features of at least one luminance component according to the fifth sub-input branch; Channel splicing is performed based on the reconstruction transformation features of at least one chromaticity component and the derived transformation features of at least one chromaticity component from the sixth sub-input branch. Channel splicing is performed based on the predicted transform features of at least one chromaticity component and the derived transform features of at least one chromaticity component from the seventh sub-input branch. 
Channel splicing is performed based on the reconstruction transformation features of at least one chromaticity component according to the eighth sub-input branch; Channel splicing is performed based on the predicted transform characteristics of at least one chromaticity component according to the ninth sub-input branch; Channel splicing is performed based on the derived transform features of at least one chromaticity component according to the tenth sub-input branch; Based on the fourth input branch, channel splicing is performed on the predicted transformation features of at least one U component, the reconstructed transformation features of at least one U component, and the derived transformation features of at least one U component. Based on the fifth input branch, channel splicing is performed on the predicted transform features of at least one V component, the reconstructed transform features of at least one V component, and the derived transform features of at least one V component.

5. The image processing method as described in claim 3, characterized in that the neural network includes a grouped convolutional module for dividing at least one input feature into at least one group for convolutional processing, and / or the at least one group includes at least one of the following: at least one set of features corresponding to the first input branch in at least one input feature; at least one set of features corresponding to the second input branch in at least one input feature; at least one set of features corresponding to the third input branch in at least one input feature; at least one set of features used to process the features corresponding to the first sub-input branch of at least one input feature; at least one set of features used to process the features corresponding to the second sub-input branch of at least one input feature; at least one set of features used to process the features corresponding to the third sub-input branch of at least one input feature; at least one set of features used to process the features corresponding to the fourth sub-input branch of at least one input feature; at least one set of features corresponding to the fifth sub-input branch in at least one input feature; at least one set of features corresponding to the sixth sub-input branch in at least one input feature; at least one set of features used to process the features corresponding to the seventh sub-input branch of at least one input feature; at least one set of features used to process the features corresponding to the eighth sub-input branch of at least one input feature; at least one set of features corresponding to the ninth sub-input branch in at least one input feature; at least one set of features used to process the features corresponding to the tenth sub-input branch of at least one input feature; at least one set of features used to process the features corresponding to the fourth input branch of at least one input feature; at least one set of features used to process the features corresponding to the fifth input branch of at least one input feature; at least one set of at least one channel from a plurality of channels for each input branch.
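
As an illustration of the grouped convolutional module recited in claim 5, the following minimal sketch splits the channels of one input feature into groups and convolves each group with its own weights. It is not taken from the application: the function names, the use of a 1x1 (pointwise) convolution at a single pixel, and the assignment of the first two channels to one input branch and the last two to another are all assumptions made for the example.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def split_into_groups(features: Sequence[float],
                      group_sizes: Sequence[int]) -> List[Vector]:
    """Split a flat channel vector into consecutive groups of the given sizes."""
    if sum(group_sizes) != len(features):
        raise ValueError("group sizes must add up to the number of channels")
    groups, start = [], 0
    for size in group_sizes:
        groups.append(list(features[start:start + size]))
        start += size
    return groups


def pointwise_conv(channels: Vector, weights: Matrix) -> Vector:
    """1x1 convolution at one pixel: each output is a weighted sum of the inputs."""
    return [sum(w * c for w, c in zip(row, channels)) for row in weights]


def grouped_conv(features: Sequence[float],
                 group_sizes: Sequence[int],
                 group_weights: Sequence[Matrix]) -> Vector:
    """Convolve every group with its own weights and concatenate the results."""
    groups = split_into_groups(features, group_sizes)
    if len(groups) != len(group_weights):
        raise ValueError("one weight matrix is required per group")
    output: Vector = []
    for channels, weights in zip(groups, group_weights):
        output.extend(pointwise_conv(channels, weights))
    return output


# Hypothetical input: channels 0-1 come from one input branch, channels 2-3 from another.
features = [1.0, 2.0, 3.0, 4.0]
weights_per_group = [
    [[1.0, 1.0]],               # group 0: 2 input channels -> 1 output channel
    [[0.5, 0.5], [1.0, -1.0]],  # group 1: 2 input channels -> 2 output channels
]
result = grouped_conv(features, [2, 2], weights_per_group)
```

Because each output channel only reads the channels of its own group, the number of multiplications is lower than for a convolution that connects every input channel to every output channel.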

6. The image processing method as described in claim 5, characterized in that the method further includes at least one of the following: the grouped convolutional module includes at least one convolutional module corresponding to a group, and the convolutional modules corresponding to at least one group are independent of each other; at least two groups contain the same number of input channels; at least two groups have the same convolutional layer parameters; the input branch is at least one of the multiple input branches corresponding to at least one input feature; the grouped convolutional module includes, in sequence, a precursor layer, a first channel shuffling layer, and a subsequent convolutional module for each group; the input of the first channel shuffling layer is the output of the precursor layers of at least two groups, and the output of the first channel shuffling layer is the input of the subsequent convolutional modules of at least two groups; the precursor layer includes at least one convolutional module for convolutional processing of at least one set of features; the first channel shuffling layer is used to perform cross-group permutation of the output feature map of the precursor layer in the channel dimension according to a first channel rearrangement rule; the grouped convolutional module includes a second channel shuffling layer and a convolutional module for each group, and the input of the convolutional module for each group is the output of the second channel shuffling layer; the second channel shuffling layer is used to perform cross-group permutation of the input feature map, which includes at least two groups of features, in the channel dimension according to a second channel rearrangement rule; the grouped convolutional module includes, in sequence, a second channel shuffling layer, a precursor layer for each group, a first channel shuffling layer, and a subsequent convolutional module for each group.
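
The cross-group permutation performed by a channel shuffling layer can be sketched as follows. The claim does not define the channel rearrangement rules, so this example assumes the interleaving rule commonly used with grouped convolutions (view the channels as a groups-by-channels grid and read it out column by column); the function names are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def channel_shuffle(channels: Sequence[T], num_groups: int) -> List[T]:
    """Interleave channels so that every new group mixes channels from all old groups.

    Assumed rule: treat the channels as a (num_groups x per_group) grid stored
    row by row, then read the grid column by column.
    """
    if num_groups <= 0 or len(channels) % num_groups != 0:
        raise ValueError("channel count must be a positive multiple of num_groups")
    per_group = len(channels) // num_groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(num_groups)]


def regroup(channels: Sequence[T], num_groups: int) -> List[List[T]]:
    """Split the (shuffled) channels back into equally sized consecutive groups."""
    per_group = len(channels) // num_groups
    return [list(channels[g * per_group:(g + 1) * per_group])
            for g in range(num_groups)]


# Outputs of the precursor layers of two groups, labelled by group (a, b) and channel.
precursor_output = ["a0", "a1", "a2", "b0", "b1", "b2"]
shuffled = channel_shuffle(precursor_output, num_groups=2)
next_inputs = regroup(shuffled, num_groups=2)
```

After the shuffle, each subsequent convolutional module receives channels that originated in both groups, which restores some information exchange between groups that are otherwise processed independently.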

7. The image processing method according to any one of claims 2 to 6, characterized in that step S12 includes at least one of the following: at least one first intermediate value is determined or obtained based on a neural network and at least one input feature, and at least one image block is filtered based on the at least one first intermediate value and at least one lookup table; at least one second intermediate value is determined or obtained based on at least one input feature and at least one lookup table, and at least one image block is filtered based on a neural network and the at least one second intermediate value; at least one index is determined or generated based on at least one input feature, and at least one image block is filtered based on the at least one index and at least one lookup table; at least one third intermediate value is determined or obtained based on a neural network and at least one input feature, at least one fourth intermediate value is determined or obtained based on the at least one third intermediate value and at least one lookup table, and at least one image block is filtered based on a neural network and the at least one fourth intermediate value; at least one fifth intermediate value is determined or obtained based on at least one input feature and at least one lookup table, at least one sixth intermediate value is determined or obtained based on the at least one fifth intermediate value and a neural network, and at least one image block is filtered based on the at least one sixth intermediate value and at least one lookup table.
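
The index-based alternative of claim 7 (determine an index from an input feature, then filter the image block using the index and a lookup table) might look like the sketch below. The claim does not say how the index is derived or what the table stores, so everything specific here is an assumption for illustration: the input feature is taken to be the deviation of each sample from the block mean, the index is that deviation quantized by a fixed step, and the table holds correction offsets.

```python
from typing import List, Sequence


def feature_to_index(sample: int, block_mean: int, step: int, table_size: int) -> int:
    """Quantize the deviation from the block mean into a clamped table index."""
    centre = table_size // 2
    index = centre + (sample - block_mean) // step  # floor division
    return max(0, min(table_size - 1, index))


def filter_block(block: Sequence[int], lut: Sequence[int],
                 step: int, max_value: int = 255) -> List[int]:
    """Filter a flattened image block by adding a table-driven offset to each sample."""
    block_mean = sum(block) // len(block)
    filtered = []
    for sample in block:
        index = feature_to_index(sample, block_mean, step, len(lut))
        corrected = sample + lut[index]
        filtered.append(max(0, min(max_value, corrected)))  # keep the valid sample range
    return filtered


# Hypothetical 5-entry table: pull samples below the mean up and samples above it down.
offset_table = [2, 1, 0, -1, -2]
block = [100, 104, 96, 120]  # flattened 2x2 block, integer mean 105
filtered_block = filter_block(block, offset_table, step=4)
```

The per-sample work is one subtraction, one division and one table read, with no convolution involved, which is the kind of complexity reduction the lookup table is intended to provide. In the other alternatives of the claim, the value fed to the table, or the value read from it, would instead be an intermediate value exchanged with the neural network.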

8. A processing apparatus, characterized by comprising: a memory and a processor, wherein the memory stores an image processing program, and the image processing program, when executed by the processor, implements the steps of the image processing method as described in any one of claims 1 to 7.

9. A storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the steps of the image processing method as described in any one of claims 1 to 7.