Image processing method, processing device and storage medium
By using a filter lookup table combined with a neural network in the video coding framework, the complexity of loop filtering is reduced, the efficiency of video encoding and decoding is improved, and the problem of high complexity of neural network loop filters is solved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHENZHEN TRANSSION HLDG CO LTD
- Filing Date
- 2025-01-06
- Publication Date
- 2026-06-12
Figure CN119854487B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of image processing technology, specifically to an image processing method, processing device, and storage medium.
Background Technology
[0002] Existing high-efficiency video coding frameworks, such as Neural Network Based Video Coding (NNVC) and / or Enhanced Compression Model (ECM), propose a video frame coding technique to improve coding performance without significantly increasing computational complexity.
[0003] In conceiving and implementing this application, the inventors discovered at least the following problem: the high complexity of neural networks in the loop filtering stage of the encoding and decoding process limits their practicality. For example, when a Low Complexity Neural Network Loop Filter (LC-NNLF) structure is used for loop filtering, the CANDECOMP/PARAFAC (CP) tensor decomposition step in the LC-NNLF structure, coupled with its large number of parameters and operations, leads to high complexity in the filtering process, thereby limiting the efficiency of video encoding and / or decoding. The foregoing description is intended to provide general background information and does not necessarily constitute prior art.
Summary of the Invention
[0004] To address the aforementioned technical problems, this application provides an image processing method, processing device, and storage medium, aiming to solve the technical problem of how to reduce the complexity of filtering processing, thereby improving the efficiency of video encoding and / or decoding.
[0005] This application provides an image processing method, applicable to a processing device, comprising the following steps:
[0006] S10, determine or generate a target image block based on reference parameters of at least one image block and a filter lookup table.
[0007] Optionally, the reference parameters include at least one of the following:
[0008] Quantization parameters; boundary strength; characterization parameters; location information of the pixel to be filtered; pixel value to be filtered; size information of the image block; filtering information of neighboring blocks; filtering information of non-neighboring blocks; filtering information of cross-component blocks; filtering information of co-position blocks; filtering information of temporal blocks; filtering information of default blocks; filtering information of candidate blocks.
[0009] Optionally, candidate blocks are determined by motion vectors and / or block vectors.
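As a minimal sketch of how a candidate block might be located from a motion vector or block vector, the function below adds the vector to the current block position. The function name, the integer-pel vector, and the clamping to the picture boundary are assumptions made for illustration; the application does not specify them.

```python
def candidate_block_origin(x, y, vector, block_w, block_h, pic_w, pic_h):
    """Return the top-left corner of the candidate block pointed to by vector.

    Assumption: vector is an integer-pel (dx, dy) displacement. Real codecs
    also use fractional-pel motion vectors, which this sketch ignores.
    """
    cx = x + vector[0]
    cy = y + vector[1]
    # Assumption: clamp so the candidate block stays inside the picture.
    cx = max(0, min(cx, pic_w - block_w))
    cy = max(0, min(cy, pic_h - block_h))
    return cx, cy
```

For a block vector the displacement points into the current picture, while for a motion vector it points into a reference picture; the position arithmetic is the same in both cases.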
[0010] Optionally, step S10 includes at least one of the following:
[0011] Determine or generate at least one first intermediate value based on a neural network and at least one reference parameter, and determine or generate a target image block based on the at least one first intermediate value and at least one filter lookup table;
[0012] Obtain at least one second intermediate value by searching at least one filter lookup table based on at least one reference parameter, and determine or generate a target image block based on a neural network and the at least one second intermediate value;
[0013] Determine or generate at least one index based on at least one reference parameter, and determine or generate a target image block based on the at least one index and at least one filter lookup table;
[0014] Determine or generate at least one third intermediate value based on a neural network and at least one reference parameter; determine or generate at least one fourth intermediate value based on the at least one third intermediate value and at least one filter lookup table; and determine or generate a target image block based on a neural network and the at least one fourth intermediate value;
[0015] Determine or generate at least one fifth intermediate value based on at least one reference parameter and at least one filter lookup table; determine or generate at least one sixth intermediate value based on the at least one fifth intermediate value and a neural network; and determine or generate a target image block based on the at least one sixth intermediate value and at least one filter lookup table.
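The index-based option in paragraph [0013] can be sketched as follows. This is only an illustration under stated assumptions: the table layout (one correction offset per quantization parameter band and boundary strength), the band size of 8, the placeholder table contents, and all function names are hypothetical and are not taken from the application.

```python
def build_offset_lut(num_qp_bands, num_bs_levels):
    """Hypothetical filter lookup table: one offset per (QP band, boundary strength).

    The contents are placeholders; a real table would be derived offline.
    """
    return [[band + bs for bs in range(num_bs_levels)]
            for band in range(num_qp_bands)]


def derive_index(qp, boundary_strength, qp_band_size=8):
    """Map two reference parameters to a (row, column) table index."""
    return qp // qp_band_size, boundary_strength


def filter_block(block, qp, boundary_strength, lut, bit_depth=8):
    """Produce the target block by adding the looked-up offset to every pixel."""
    row, col = derive_index(qp, boundary_strength)
    offset = lut[row][col]
    max_val = (1 << bit_depth) - 1
    # Clip to the valid sample range for the given bit depth.
    return [[max(0, min(p + offset, max_val)) for p in line] for line in block]


lut = build_offset_lut(num_qp_bands=8, num_bs_levels=3)
target = filter_block([[100, 254], [0, 50]], qp=37, boundary_strength=2, lut=lut)
```

With these placeholder values, QP 37 falls in band 4, so the offset is 4 + 2 = 6 and `target` is `[[106, 255], [6, 56]]`. The per-pixel work is one table read, one addition, and one clip, which is the kind of cost reduction a lookup table offers over evaluating a network.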
[0016] Optionally, the neural network includes at least one of the following:
[0017] Neural networks based on fully connected layers;
[0018] Neural networks based on convolutional layers;
[0019] Neural networks based on Transformer;
[0020] Neural networks based on hybrid convolutional and fully connected layers and Transformers.
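Paragraph [0046] notes that the lookup table can be constructed based on a neural network. One way this might be done, sketched below, is to evaluate a small network once for every possible quantized input and store the results. The two-input fully connected network, its weights, and the 4-bit input width are all made up for illustration.

```python
def tiny_network(a, b):
    """Stand-in for a trained network: one hidden ReLU layer with fixed weights.

    Inputs are assumed to be 4-bit quantized values (0..15). The weights are
    invented for this sketch and carry no meaning.
    """
    h1 = max(0.0, 0.5 * a - 0.25 * b + 1.0)
    h2 = max(0.0, -0.5 * a + 0.75 * b)
    return int(round(0.5 * h1 + 0.5 * h2))


def build_lut(bits=4):
    """Evaluate the network once for every possible input pair."""
    size = 1 << bits
    return [[tiny_network(a, b) for b in range(size)] for a in range(size)]


lut = build_lut()

# At run time the network is replaced by a single table read.
assert lut[9][3] == tiny_network(9, 3)
```

The table has 2^(bits x inputs) entries, here 256, so this approach only stays practical when the number of inputs and their bit width are kept small.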
[0021] Optionally, when the neural network includes a convolutional layer-based neural network, determining or generating at least one first intermediate value based on the neural network and at least one reference parameter includes:
[0022] At least one image block input to a neural network containing a convolutional module is processed to obtain at least one second image block;
[0023] At least one second image block is processed based on the convolution module and reference parameters to determine or generate at least one first intermediate value.
[0024] Optionally, the convolutional layers of the convolutional module include at least one of the following:
[0025] Asymmetric convolutional layers;
[0026] Grouped convolutional layers;
[0027] Partial convolution layers;
[0028] Depthwise separable convolutional layers.
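To illustrate why these layer types reduce complexity, the sketch below counts convolution weights (bias excluded) using the standard formulas. The channel counts and kernel sizes are arbitrary examples, and reading "asymmetric" as a 1x3 layer followed by a 3x1 layer is an assumption. Partial convolution is left out because the application does not define it here.

```python
def conv_params(c_in, c_out, kh, kw, groups=1):
    """Weight count of a (grouped) convolution, bias excluded."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * kh * kw


def depthwise_separable_params(c_in, c_out, kh, kw):
    """Depthwise kh x kw convolution followed by a 1x1 pointwise convolution."""
    depthwise = conv_params(c_in, c_in, kh, kw, groups=c_in)
    pointwise = conv_params(c_in, c_out, 1, 1)
    return depthwise + pointwise


standard = conv_params(64, 64, 3, 3)                                 # 36864
asymmetric = conv_params(64, 64, 1, 3) + conv_params(64, 64, 3, 1)   # 24576
grouped = conv_params(64, 64, 3, 3, groups=4)                        # 9216
separable = depthwise_separable_params(64, 64, 3, 3)                 # 4672
```

For 64 input and 64 output channels, each alternative needs fewer weights than the standard 3x3 layer, and the multiply count per output pixel shrinks by the same ratio.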
[0029] Optionally, the image processing method further includes at least one of the following:
[0030] At least two convolutional layers in a convolutional module have the same convolutional template;
[0031] In a convolutional module, at least two convolutional layers correspond to convolutional templates that are all different from each other.
[0032] The pixel positions where the convolution kernel operates are different in at least two convolution templates;
[0033] The number and position of pixels in the convolution template are determined based on the receptive field.
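A convolution template can be thought of as the list of pixel positions that a kernel reads. The sketch below represents a template as (dy, dx) offsets and shows that the same three-weight kernel covers a different receptive field depending on the template. The two example templates, the function names, and the edge clamping are assumptions for illustration, not the layouts shown in the figures of this application.

```python
def apply_template(image, y, x, template, weights):
    """Weighted sum over the pixel positions named by the template.

    template: list of (dy, dx) offsets relative to the current pixel.
    weights:  one kernel weight per offset.
    Assumption: out-of-picture positions are clamped to the nearest pixel.
    """
    h, w = len(image), len(image[0])
    acc = 0
    for (dy, dx), wgt in zip(template, weights):
        yy = max(0, min(y + dy, h - 1))
        xx = max(0, min(x + dx, w - 1))
        acc += wgt * image[yy][xx]
    return acc


def receptive_field(template):
    """Height and width of the bounding box covered by the offsets."""
    dys = [dy for dy, _ in template]
    dxs = [dx for _, dx in template]
    return max(dys) - min(dys) + 1, max(dxs) - min(dxs) + 1


ROW = [(0, -1), (0, 0), (0, 1)]      # three contiguous taps in one row
DIAG = [(-1, -1), (0, 0), (1, 1)]    # three taps spread over a 3x3 area

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
row_out = apply_template(image, 1, 1, ROW, [3, 2, 1])    # 3*4 + 2*5 + 1*6
diag_out = apply_template(image, 1, 1, DIAG, [3, 2, 1])  # 3*1 + 2*5 + 1*9
```

Both templates cost three multiplications per output pixel, but `receptive_field(ROW)` is `(1, 3)` while `receptive_field(DIAG)` is `(3, 3)`, which is the sense in which the number and position of template pixels can be chosen from a desired receptive field.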
[0034] Optionally, at least one index is determined or generated based on at least one reference parameter, including at least one of the following:
[0035] Determine or generate a first pixel parameter based on at least one reference parameter, and determine or generate at least one index based on the first pixel parameter;
[0036] Determine or generate at least one first characterization value based on at least one reference parameter and at least one characterization lookup table, and determine or generate at least one index based on the at least one first characterization value;
[0037] Determine or generate a second pixel parameter based on at least one reference parameter, determine or generate at least one second characterization value based on the second pixel parameter and at least one characterization lookup table, and determine or generate at least one index based on the at least one second characterization value;
[0038] Determine or generate at least one third characterization value based on at least one reference parameter and at least one characterization lookup table, determine or generate a third pixel parameter based on the at least one third characterization value, and determine or generate at least one index based on the third pixel parameter;
[0039] Determine or generate at least one fourth characterization value based on the most significant bit of at least one pixel value to be filtered, and determine or generate at least one index based on the at least one fourth characterization value;
[0040] Determine or generate at least one fifth characterization value based on the characterization parameters, and determine or generate at least one index based on the at least one fifth characterization value;
[0041] Determine or generate at least one non-characterization value based on at least one reference parameter, and determine or generate at least one index based on the at least one non-characterization value.
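The most-significant-bit option in paragraph [0039] can be illustrated with a short sketch. Keeping four bits per pixel, concatenating the bits of several pixels into one index, and the function name are assumptions; the application only states that the index is derived from the most significant bit of the pixel values to be filtered.

```python
def msb_index(pixels, bit_depth=8, msb_bits=4):
    """Concatenate the msb_bits most significant bits of each pixel into an index."""
    shift = bit_depth - msb_bits
    index = 0
    for p in pixels:
        # Drop the low bits, then append the remaining high bits to the index.
        index = (index << msb_bits) | (p >> shift)
    return index


idx = msb_index([200, 37, 255])   # high nibbles 0xC, 0x2, 0xF -> 0xC2F
table_size = 1 << (4 * 3)         # 4096 entries cover every possible index
```

Dropping the low bits is what keeps the table small: three full 8-bit pixels would need 2^24 entries, while three 4-bit values need only 4096.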
[0042] This application also provides a processing apparatus, including:
[0043] The processing module is used to determine or generate a target image block based on reference parameters of at least one image block and a filter lookup table.
[0044] This application also provides a processing device, including: a memory and a processor, wherein the memory stores an image processing program, and when the image processing program is executed by the processor, it implements the steps of any of the image processing methods described above.
[0045] This application also provides a storage medium storing a computer program that, when executed by a processor, implements the steps of any of the image processing methods described above.
[0046] As described above, the processing method of this application can be applied to a processing device, including the steps of: determining or generating a target image block based on reference parameters of at least one image block and a filter lookup table. Through the technical solution of this application, the complexity of the filter can be reduced when filtering at least one image block using a filter lookup table; and / or since the lookup table can be constructed based on a neural network, the filtering effect of the filtering process can be guaranteed, while avoiding the direct use of a neural network for filtering, thereby improving the efficiency of video encoding and / or decoding.
Attached Figure Description
[0047] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application. To more clearly illustrate the technical solutions of the embodiments of this application, the drawings used in the description of the embodiments will be briefly introduced below. Obviously, those skilled in the art can obtain other drawings based on these drawings without any creative effort.
[0048] Figure 1 is a schematic diagram of the hardware structure of a mobile terminal implementing the various embodiments of this application;
[0049] Figure 2 is an architecture diagram of a communication network system provided by an embodiment of this application;
[0050] Figure 3 is a schematic diagram of the hardware structure of a controller 140 provided in this application;
[0051] Figure 4 is a schematic diagram of the hardware structure of a network node 150 provided in this application;
[0052] Figure 5 is a flowchart illustrating the image processing method according to the first embodiment;
[0053] Figure 6 is a flowchart illustrating the encoding and decoding process in the image processing method;
[0054] Figure 7 is a schematic diagram of a convolution template in the image processing method shown in the first embodiment;
[0055] Figure 8 is a schematic diagram of the image processing method using a filter lookup table according to the first embodiment;
[0056] Figure 9 is a schematic diagram illustrating the use of a neural network in the image processing method according to the fourth embodiment;
[0057] Figure 10 is a schematic diagram of the convolution templates corresponding to each convolution layer in the convolution module of the image processing method shown in the fourth embodiment;
[0058] Figure 11 is a schematic diagram of a convolution template for the operation of a 1x3 convolution kernel with a small receptive field in the image processing method shown in the fourth embodiment;
[0059] Figure 12 is a schematic diagram of the convolution template corresponding to a 1x3 convolution kernel with an approximate 3x3 receptive field in the image processing method shown in the fourth embodiment;
[0060] Figure 13 is a schematic diagram of the convolution template corresponding to a 1x2 convolution kernel approximating a 3x3 convolution kernel in the image processing method shown in the fourth embodiment;
[0061] Figure 14 is a schematic diagram of the convolution template corresponding to a 1x3 convolution kernel approximating a 4x4 convolution kernel in the image processing method shown in the fourth embodiment;
[0062] Figure 15 is another schematic diagram illustrating the use of a neural network in the image processing method according to the fourth embodiment;
[0063] Figure 16 is a schematic diagram of the image processing method according to the fourth embodiment, in which a neural network combines a convolution template and reference parameters for filtering;
[0064] Figure 17 is a schematic diagram of the processing module of the processing device.
[0065] The realization of the objectives, functional features, and advantages of this application will be further explained in conjunction with the embodiments and with reference to the accompanying drawings. The accompanying drawings have illustrated specific embodiments of this application, which will be described in more detail below. These drawings and textual descriptions are not intended to limit the scope of the concept in any way, but rather to illustrate the concepts of this application to those skilled in the art through reference to specific embodiments.
Detailed Implementation
[0066] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.
[0067] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element. Furthermore, components, features, and elements with the same names in different embodiments of this application may have the same meaning or different meanings, the specific meaning of which must be determined by its interpretation in that specific embodiment or further in conjunction with the context of that specific embodiment.
[0068] It should be understood that although the terms first, second, third, etc., may be used herein to describe various information, such information should not be limited to these terms. These terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of this document, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word “if” as used herein may be interpreted as “when…” or “in response to determination”. Furthermore, as used herein, the singular forms “a,” “an,” and “the” are intended to also include the plural forms unless the context indicates otherwise. It should be further understood that the terms “comprising,” “including,” indicate the presence of the stated feature, step, operation, element, component, item, kind, and / or group, but do not exclude the presence, occurrence, or addition of one or more other features, steps, operations, elements, components, items, kinds, and / or groups. The terms “or,” “and / or,” “including at least one of the following,” etc., as used in this application, may be interpreted as inclusive, or mean any one or any combination thereof. For example, "including at least one of the following: A, B, C" means "any one of the following: A; B; C; A and B; A and C; B and C; A and B and C." Similarly, "A, B, or C" or "A, B, and / or C" means "any one of the following: A; B; C; A and B; A and C; B and C; A and B and C." Exceptions to this definition only occur when the combination of elements, functions, steps, or operations is inherently mutually exclusive in some way.
[0069] It should be understood that although the steps in the flowcharts of this application's embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times, and their execution order is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.
[0070] Depending on the context, the words “if” or “suppose” as used here can be interpreted as “when” or “in response to determination” or “in response to detection.” Similarly, depending on the context, the phrases “if determination” or “if detection (of the stated condition or event)” can be interpreted as “when determination” or “in response to determination” or “when detection (of the stated condition or event)” or “in response to detection (of the stated condition or event).”
[0071] It should be noted that step designations such as S10 and S20 are used in this document for the purpose of more clearly and concisely describing the corresponding content, and do not constitute a substantial limitation on the order. In specific implementation, those skilled in the art may execute S20 first and then S10, etc., but these should all be within the protection scope of this application.
[0072] It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to limit this application.
[0073] In the following description, suffixes such as "module," "part," or "unit" used to denote elements serve only to facilitate the description of this application and have no specific meaning in themselves. Therefore, "module," "part," or "unit" may be used interchangeably.
[0074] The processing device in this application can be a smart terminal or a server. Optionally, the smart terminal can be implemented in various forms. For example, the smart terminal described in this application can include mobile terminals such as mobile phones, tablets, laptops, handheld computers, personal digital assistants (PDAs), portable media players (PMPs), navigation devices, wearable devices, smart bracelets, and pedometers, as well as fixed terminals such as digital TVs and desktop computers.
[0075] The following description will use a mobile terminal as an example. Those skilled in the art will understand that, apart from elements specifically designed for mobile purposes, the construction according to the embodiments of this application can also be applied to fixed-type terminals.
[0076] Please refer to Figure 1, which is a schematic diagram of the hardware structure of a mobile terminal implementing various embodiments of this application. The mobile terminal 100 may include: an RF (Radio Frequency) unit 101, a WiFi module 102, an audio output unit 103, an A / V (Audio / Video) input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, and a power supply 111, etc. Those skilled in the art will understand that the mobile terminal structure shown in Figure 1 does not constitute a limitation on the mobile terminal. The mobile terminal may include more or fewer components than shown, or combine certain components, or have different component arrangements.
[0077] The components of the mobile terminal are described in detail below with reference to Figure 1:
[0078] The radio frequency unit 101 can be used for receiving and transmitting signals during information transmission or calls. Specifically, it receives downlink information from the base station and passes it to the processor 110 for processing; additionally, it transmits uplink data to the base station. Typically, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier, and a duplexer. Furthermore, the radio frequency unit 101 can also communicate wirelessly with networks and other devices. The aforementioned wireless communications may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division Duplexing-Long Term Evolution), TDD-LTE (Time Division Duplexing-Long Term Evolution), 5G, and 6G.
[0079] WiFi is a short-range wireless transmission technology. Through the WiFi module 102, the mobile terminal can help users send and receive emails, browse web pages, and access streaming media, providing users with wireless broadband internet access. Although Figure 1 shows the WiFi module 102, it is understood that it is not a necessary component of a mobile terminal and can be omitted as needed without changing the nature of the invention.
[0080] The audio output unit 103 can convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into audio signals and output them as sound when the mobile terminal 100 is in call signal receiving mode, call mode, recording mode, voice recognition mode, broadcast receiving mode, etc. Furthermore, the audio output unit 103 can also provide audio output related to specific functions performed by the mobile terminal 100 (e.g., call signal receiving sound, message receiving sound, etc.). The audio output unit 103 may include a speaker, a buzzer, etc.
[0081] The A / V input unit 104 is used to receive audio or video signals. The A / V input unit 104 may include a graphics processing unit (GPU) 1041 and a microphone 1042. The GPU 1041 processes image data of still images or videos acquired by an image capture device (such as a camera) in video capture mode or image capture mode. The processed image frames can be displayed on the display unit 106. The image frames processed by the GPU 1041 can be stored in the memory 109 (or other storage medium) or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 can receive sound (audio data) in operating modes such as telephone call mode, recording mode, and voice recognition mode, and can process such sound into audio data. The processed audio (voice) data can be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 101 in telephone call mode. The microphone 1042 can implement various types of noise cancellation (or suppression) algorithms to eliminate (or suppress) noise or interference generated during the reception and transmission of audio signals.
[0082] The mobile terminal 100 also includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Optionally, the light sensor includes an ambient light sensor and a proximity sensor. Optionally, the ambient light sensor can adjust the brightness of the display panel 1061 according to the ambient light level, and the proximity sensor can turn off the display panel 1061 and / or backlight when the mobile terminal 100 is moved to the ear. As a type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in various directions (generally three axes), and can detect the magnitude and direction of gravity when stationary. It can be used for applications that recognize the phone's posture (such as landscape / portrait switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer, tapping), etc. Other sensors that may be configured in the phone, such as fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, will not be described in detail here.
[0083] The display unit 106 is used to display information input by the user or information provided to the user. The display unit 106 may include a display panel 1061, which may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
[0084] User input unit 107 can be used to receive input numerical or character information, and generate key signal inputs related to user settings and function control of the mobile terminal. Optionally, user input unit 107 may include touch panel 1071 and other input devices 1072. Touch panel 1071, also known as a touch screen, can collect touch operations performed by the user on or near it (such as operations performed by the user using a finger, stylus, or any suitable object or accessory on or near touch panel 1071), and drive corresponding connection devices according to a pre-set program. Touch panel 1071 may include a touch detection device and a touch controller. Optionally, the touch detection device detects the user's touch position and the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, sends it to processor 110, and can receive and execute commands sent by processor 110. In addition, touch panel 1071 can be implemented using various types such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 1071, the user input unit 107 may also include other input devices 1072. Optionally, other input devices 1072 may include, but are not limited to, one or more of the following: physical keyboard, function keys (such as volume control buttons, power buttons, etc.), trackball, mouse, joystick, etc., without being specifically limited here.
[0085] Optionally, the touch panel 1071 may cover the display panel 1061. When the touch panel 1071 detects a touch operation on or near it, it transmits the information to the processor 110 to determine the type of touch event. Subsequently, the processor 110 provides corresponding visual output on the display panel 1061 based on the type of touch event. Although in Figure 1 the touch panel 1071 and the display panel 1061 are shown as two independent components that realize the input and output functions of the mobile terminal, in some embodiments the touch panel 1071 and the display panel 1061 can be integrated to realize these functions. The specific implementation is not limited here.
[0086] Interface unit 108 serves as an interface through which at least one external device can connect to mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device with an identification module, an audio input / output (I / O) port, a video I / O port, a headphone port, and so on. Interface unit 108 may be used to receive input (e.g., data, power, etc.) from the external device and transmit the received input to one or more elements within mobile terminal 100, or it may be used to transmit data between mobile terminal 100 and the external device.
[0087] The memory 109 can be used to store software programs and various data. The memory 109 may primarily include a program storage area and a data storage area. Optionally, the program storage area may store the operating system, applications required for at least one function (such as sound playback, image playback, etc.), etc.; the data storage area may store data created based on the use of the mobile phone (such as audio data, phonebook, etc.). Furthermore, the memory 109 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device.
[0088] The processor 110 is the control center of the mobile terminal. It connects various parts of the mobile terminal via various interfaces and lines. By running or executing software programs and / or modules stored in the memory 109, and by calling data stored in the memory 109, it performs various functions and processes data of the mobile terminal, thereby providing overall monitoring of the mobile terminal. The processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor and a modem processor. Optionally, the application processor mainly handles the operating system, user interface, and applications, while the modem processor mainly handles wireless communication. It is understood that the modem processor may not be integrated into the processor 110.
[0089] The mobile terminal 100 may also include a power supply 111 (such as a battery) that supplies power to various components. Preferably, the power supply 111 can be logically connected to the processor 110 through a power management system, thereby enabling functions such as charging, discharging, and power consumption management through the power management system.
[0090] Although not shown in Figure 1, the mobile terminal 100 may also include a Bluetooth module, etc., which will not be described in detail here.
[0091] To facilitate understanding of the embodiments of this application, the communication network system on which the mobile terminal of this application is based is described below.
[0092] Please refer to Figure 2, which is an architecture diagram of a communication network system provided by an embodiment of this application. The communication network system is an LTE system of universal mobile telecommunications technology. The LTE system includes a UE (User Equipment) 201, an E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) 202, an EPC (Evolved Packet Core) 203, and the operator's IP services 204, which are connected in sequence.
[0093] Optionally, the UE 201 can be the aforementioned mobile terminal 100, which will not be described in detail here.
[0094] The E-UTRAN 202 includes an eNodeB 2021 and other eNodeBs 2022, etc. Optionally, the eNodeB 2021 can connect to the other eNodeBs 2022 via backhaul (e.g., an X2 interface), and the eNodeB 2021 connects to the EPC 203, providing access from the UE 201 to the EPC 203.
[0095] The EPC 203 may include an MME (Mobility Management Entity) 2031, an HSS (Home Subscriber Server) 2032, other MMEs 2033, an SGW (Serving Gateway) 2034, a PGW (Packet Data Network Gateway) 2035, and a PCRF (Policy and Charging Rules Function) 2036, etc. Optionally, the MME 2031 is the control node that handles signaling between the UE 201 and the EPC 203, providing bearer and connection management. The HSS 2032 is used to provide registers to manage functions such as the Home Location Register (not shown in the figure) and stores user-specific information such as service characteristics and data rates. All user data can be sent through the SGW 2034. The PGW 2035 can provide IP address allocation for the UE 201 and other functions. The PCRF 2036 is the policy and charging control decision point for service data flows and IP bearer resources. It selects and provides available policy and charging control decisions for the policy and charging enforcement function unit (not shown in the figure).
[0096] IP services 204 may include the Internet, intranet, IMS (IP Multimedia Subsystem), or other IP services.
[0097] Although the above description uses the LTE system as an example, those skilled in the art should know that this application is not only applicable to the LTE system, but also to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA, 5G and future new network systems (such as 6G), etc., without limitation.
[0098] Figure 3 is a schematic diagram of the hardware structure of a controller 140 provided in this application. The controller 140 includes a memory 1401 and a processor 1402. The memory 1401 is used to store program instructions, and the processor 1402 is used to call the program instructions in the memory 1401 to execute the steps performed by the controller in the first embodiment of the above method. The implementation principle and beneficial effects are similar, and will not be described again here.
[0099] Optionally, the controller further includes a communication interface 1403, which can be connected to the processor 1402 via a bus 1404. The processor 1402 can control the communication interface 1403 to implement the receiving and sending functions of the controller 140.
[0100] Figure 4 is a schematic diagram of the hardware structure of a network node 150 provided in this application. The network node 150 includes a memory 1501 and a processor 1502. The memory 1501 is used to store program instructions, and the processor 1502 is used to call the program instructions in the memory 1501 to execute the steps performed by the first node in the first embodiment of the above method. The implementation principle and beneficial effects are similar, and will not be described again here.
[0101] Optionally, the network node further includes a communication interface 1503, which can be connected to the processor 1502 via a bus 1504. The processor 1502 can control the communication interface 1503 to implement the receiving and sending functions of the network node 150.
[0102] The integrated modules described above, implemented as software functional modules, can be stored in a computer-readable storage medium. These software functional modules, stored in a storage medium, include several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute some steps of the methods of the various embodiments of this application.
[0103] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the flow or function according to the embodiments of this application is generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a storage medium or transmitted from one storage medium to another. For example, computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The storage medium can be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state disk, SSD), etc.
[0104] Based on the above-described mobile terminal hardware structure and communication network system, various embodiments of this application are proposed.
[0105] First Embodiment
[0106] Referring to Figure 5, Figure 5 is a flowchart illustrating the image processing method according to the first embodiment. The image processing method of this embodiment can be applied to a processing device and includes step S10:
[0107] Step S10: Determine or generate a target image block based on reference parameters of at least one image block and a filter lookup table.
[0108] In this embodiment, the processing device can be a smart terminal, such as a mobile phone or computer, or a server, such as a local server or a cloud server. This embodiment and this application primarily use a smart terminal as an example for illustration.
[0109] Optionally, the technical solution of this embodiment can be applied to fields such as image encoding and decoding, video encoding and decoding, hardware video encoding and decoding, dedicated circuit video encoding and decoding, and real-time video encoding and decoding.
[0110] Optionally, for ease of understanding, a brief introduction to the encoding and decoding process is provided. As shown in Figure 6, the system includes modules such as general coding control, transform and quantization, intra-frame estimation, intra-frame prediction, motion compensation, motion estimation, inverse quantization and inverse transform, filter control analysis, deblocking filtering and SAO filtering (i.e., loop filtering), entropy coding, and a decoding frame buffer. Optionally, the motion compensation module can perform intra-frame / inter-frame selection to determine the specific compensation. Optionally, during entropy coding, the coded bitstream is obtained based on the general control data determined by the general coding control module, the quantized transform coefficients determined by the transform and quantization module, the intra-frame prediction data and filter control data determined by the filter control analysis, and the motion data determined by the decoding frame buffer.
[0111] Optionally, the decoded video signal can be output through a decoding frame buffer.
[0112] Optionally, the loop filtering may include two branches: a deblocking filter and an LC-NNLF. The results of the branches are then fused and processed by SAO (Sample Adaptive Offset) and ALF (Adaptive Loop Filter).
[0113] Optionally, the use of LC-NNLF (Low Complexity Neural Network-based Loop Filter) in the loop filtering stage leads to excessively high complexity. In this embodiment, a filter lookup table is used in the loop filtering stage to reduce the complexity of the filtering stage and thus the complexity of video encoding and / or decoding.
[0114] Optionally, for details of the reference parameters, see the second embodiment.
[0115] Optionally, the filtering lookup table can be a type of lookup table. Lookup tables are a common method for accelerating computation in embedded systems. For a complex function or a series of calculations, if the output value is placed in a lookup table, then all that is needed afterward is to retrieve the value, without performing the computation process. Therefore, lookup tables are effective when the computation time is longer than the memory access time. Optionally, the filtering lookup table may include at least one pixel value before filtering, at least one pixel value after filtering (e.g., the filtered pixel value of the first filtered pixel), and the correspondence between the two.
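The lookup-table idea described above can be illustrated with a minimal sketch. The function `expensive_filter` is a hypothetical stand-in for any costly per-pixel computation and is not taken from this application; the point is only that, once all 256 outputs for an 8-bit input are stored, filtering reduces to one memory read per pixel.

```python
import math

def expensive_filter(pixel: int) -> int:
    """Hypothetical stand-in for a costly per-pixel computation."""
    return min(255, max(0, round(255.0 * math.sqrt(pixel / 255.0))))

# Precompute once: one entry per possible 8-bit pixel value before filtering.
FILTER_LUT = [expensive_filter(p) for p in range(256)]

def filter_with_lut(pixels):
    """Map each pixel value before filtering to its stored filtered value."""
    return [FILTER_LUT[p] for p in pixels]
```

The table stores the correspondence between the pixel value before filtering (the list position) and the pixel value after filtering (the stored entry), which is the relationship this paragraph describes.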
[0116] Optionally, the filtering lookup table may be a lookup table that includes filtered pixels. Optionally, the filtering lookup table may include the correspondence between pixels, the correspondence between pixels to be filtered and filtered pixels, the correspondence between representation values and filtered pixels, and the correspondence between intermediate values and filtered pixels.
[0117] Optionally, each data cell in the filter lookup table is a filtered pixel value.
[0118] Optionally, an index can be set in the filter lookup table so that the corresponding filtered pixel value can be determined by searching the filter lookup table based on the index.
[0119] Optionally, the target image patch may be a filtered image patch. Optionally, the target image patch may include at least one first filtered pixel, or pixels associated with at least one first filtered pixel.
[0120] Optionally, the processing device can be a decoding end, and if at the decoding end, the target image block can be a decoded image block. Optionally, the processing device can be an encoding end, and if at the encoding end, the target image block can be an image block that has undergone filtering processing.
[0121] Optionally, in this embodiment, the pre-trained neural network can be converted into a filter lookup table, and then the target image block after filtering can be determined according to the filter lookup table. This further reduces the complexity of the neural network video compression method while maintaining essentially no performance loss, and / or provides better compression performance compared to non-neural network video compression; and / or, because a filter lookup table is used, GPU inference is not required, enabling widespread application in various devices such as mobile phones, televisions, and cameras.
[0122] Optionally, the filtering process described in this embodiment can be used for loop filtering, post-filtering, and other stages that may require filtering.
[0123] Optionally, before using a filter lookup table to determine or obtain the filtered target image block, the following four stages can be performed, and then at least one first filter pixel can be determined by searching in at least one filter lookup table based on at least one pixel to be filtered, and the target image block can be determined based on at least one first filter pixel.
[0124] Optionally, the four stages are:
[0125] (1) Training phase: Train a neural network;
[0126] (2) For the trained neural network, perform bit depth sampling on the input pixel values, and convert the sampled values and the output of the neural network into a filter lookup table;
[0127] (3) Fine-tune the filter lookup table to obtain all input and output results at full bit depth;
[0128] (4) Use the filter lookup table for inference testing: determine the index of the filter lookup table based on the pixel to be filtered, input the index into the filter lookup table for lookup, and obtain the filtered pixel.
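The four stages above can be sketched end to end for the simplest case of a 1-pixel receptive field. Everything in this sketch is an assumption made for illustration: `trained_network` is a hypothetical stand-in for the trained neural network, the sampling step of 16 is one possible choice, and stage (3) is approximated by linear interpolation between sampled entries, which is only one possible reading of "fine-tuning" and is not stated in this application.

```python
def trained_network(pixel: int) -> int:
    """Stage 1 stand-in: hypothetical mapping learned by a trained network."""
    return min(255, max(0, pixel + (128 - pixel) // 8))

STEP = 16  # assumed sampling interval for stage 2

# Stage 2: sample the input range and store the network output per sample.
coarse_lut = {x: trained_network(x) for x in range(0, 256, STEP)}
coarse_lut[255] = trained_network(255)  # extra end point for interpolation

def expand_to_full_depth(coarse):
    """Stage 3 (assumed): fill all 256 entries by linear interpolation."""
    keys = sorted(coarse)
    full = []
    for p in range(256):
        lo = max(k for k in keys if k <= p)
        hi = min(k for k in keys if k >= p)
        if lo == hi:
            full.append(coarse[lo])
        else:
            t = (p - lo) / (hi - lo)
            full.append(round(coarse[lo] + t * (coarse[hi] - coarse[lo])))
    return full

full_lut = expand_to_full_depth(coarse_lut)

def filter_block(pixels):
    """Stage 4: the pixel value serves as the index into the table."""
    return [full_lut[p] for p in pixels]
```

At sampled points the table reproduces the stand-in network exactly; between them the output is only an approximation, which is the trade-off that sampling introduces.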
[0129] Optionally, during the training phase of the neural network, either symmetric or asymmetric convolutional kernels can be used for training. A corresponding filter lookup table can be constructed for each convolutional kernel in the neural network. The filter lookup tables for each convolutional kernel can be the same or different.
[0130] Optionally, when training a neural network, multiple convolutional layers can be stacked in the neural network, and the receptive field size of the convolutional kernel of each convolutional layer is directly related to the size of the filter lookup table into which that layer is later converted.
[0131] Table 1 shows the filter lookup table sizes for different receptive fields, where n is a positive integer greater than 1, such as 10. As shown in Table 1, an 8-bit input pixel has 256 possible pixel values. Assuming the receptive field of the convolution kernel is 2x2 (4 pixels), there are 256^4 possible combinations of pixel values; therefore, the size of the filter lookup table corresponding to the convolution kernel of this convolutional layer is 256^4 bytes, meaning 4GB of storage is required for the filter lookup table.
[0132] Table 1. Size of the filter lookup table for different receptive fields
[0133]
Receptive field of convolution kernel | Filter lookup table | Filter lookup table size
1 pixel | 1D | 256B
2 pixels | 2D | 32KB
3 pixels | 3D | 16MB
4 pixels | 4D | 4GB
5 pixels | 5D | 1TB
n pixels | nD | (2^8)^n B
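The growth in Table 1 follows from the general formula (2^8)^n bytes, assuming one byte per table entry. The short sketch below evaluates that formula; the helper names are hypothetical. Note that the formula yields 64KB for a 2-pixel receptive field, whereas Table 1 as published lists 32KB.

```python
def lut_size_bytes(receptive_field: int, bit_depth: int = 8) -> int:
    """Number of one-byte entries needed to cover every input combination."""
    return (2 ** bit_depth) ** receptive_field

def human_readable(num_bytes: int) -> str:
    """Format a byte count using binary units (1KB = 1024B)."""
    units = ["B", "KB", "MB", "GB", "TB"]
    i = 0
    while num_bytes >= 1024 and i < len(units) - 1:
        num_bytes //= 1024
        i += 1
    return f"{num_bytes}{units[i]}"

for n in range(1, 6):
    print(n, "pixel(s):", human_readable(lut_size_bytes(n)))
```

Each additional pixel in the receptive field multiplies the table size by 256, which is why receptive fields beyond 3 or 4 pixels are impractical without sampling.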
[0134] Optionally, when training the neural network, asymmetric convolutional kernels can be used for the convolutional layers. For example, Figure 7 shows the convolution templates corresponding to three different types of 1x3 receptive field convolution kernels, covering pixels I0 to I8. The convolution template in Figure 7(a) selects pixels I0, I1, and I2. The convolution template in Figure 7(b) selects pixels I0, I4, and I5. The convolution template in Figure 7(c) selects pixels I0, I7, and I8.
[0135] When training a neural network, the input image block can be rotated, causing the 1x3 convolutional kernel to act on different positions within a 3x3 pixel matrix. This approximates the convolution of the 1x3 kernel with the full 3x3 image block, thus enlarging the receptive field of the 1x3 kernel. M convolutional layers with different templates are cascaded to form a deep neural network, and this network is trained until convergence, yielding a filtering-enhancement neural network. M is a positive integer greater than 1.
[0136] Optionally, since the filtering-enhanced neural network uses M convolutional kernels with different convolutional templates during the training phase, each convolutional template corresponds to a certain input-output relationship, and this relationship is converted into a filtering lookup table for storage. That is, all possible inputs and outputs of the M different convolutional kernels are stored as lookup tables, resulting in the filtering lookup table. For example, for M different convolutional kernels with a receptive field of 3, M filtering lookup tables need to be stored.
[0137] Optionally, bit-plane sampling can be performed on the input values fed into the neural network (such as pixels to be filtered in an input image block). For example, for 8-bit input pixels, a 1x3 convolutional kernel covers 256^3 possible combinations of pixel values. Therefore, the input image block can be sampled according to a sampling interval, such as 4 bits (i.e., a step of 2^4 = 16). For example, for the 256 pixel values [0,1,2,...,255], the pixel values after sampling at 4-bit intervals are [0,16,32,48,64,80,96,112,128,144,160,176,192,208,224,240]. There are 16^3 possible combinations of pixel values after sampling. The filter lookup table can store only the sampled points and their corresponding outputs; in this case, each table is 4KB in size (16^3 = 4096 entries of one byte each). For example, as shown in Table 2 below.
[0138] Table 2
[0139]
[0140]
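The 4-bit sampling described before Table 2 can be sketched as follows. The way three sampled pixels are flattened into a single table index (row-major order) and the mapping of each pixel to the nearest lower sampled point (dropping the low 4 bits) are assumptions made for illustration; this application does not specify them.

```python
STEP_BITS = 4            # sampling interval of 4 bits
STEP = 1 << STEP_BITS    # step of 16 between sampled pixel values
LEVELS = 256 // STEP     # 16 sampled values per pixel

sampled_values = list(range(0, 256, STEP))

def lut_index(i0: int, i1: int, i2: int) -> int:
    """Flatten three 8-bit pixels into one index of a sampled 3D table."""
    q0, q1, q2 = i0 >> STEP_BITS, i1 >> STEP_BITS, i2 >> STEP_BITS
    return (q0 * LEVELS + q1) * LEVELS + q2

TABLE_ENTRIES = LEVELS ** 3  # 4096 entries, i.e. 4KB at one byte per entry
```

Compared with the 16MB needed for a full 3-pixel table, the sampled table is 4096 times smaller, at the cost of representing only every sixteenth pixel value exactly.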
[0141] Optionally, after obtaining the filter lookup table based on the trained neural network, the filter lookup table can be used for filtering during the filtering stage of video encoding and / or video decoding.
[0142] For example, as shown in Figure 8, if the pixels to be filtered in at least one image block are pixels I0, I1, and I2, then the indices corresponding to pixels I0, I1, and I2 can be determined and input into at least one filtering lookup table (such as a multi-level filtering lookup table) for filtering. If the multi-level filtering lookup table consists of filtering lookup table 1 to filtering lookup table 9, then the indices can be input into filtering lookup table 1 for searching, and the output of filtering lookup table 1 can be input into filtering lookup table 2. Filtering is then performed sequentially until filtering lookup table 9 outputs the filtered pixel corresponding to the pixel to be filtered, and the target image block V is determined based on the filtered pixel.
[0143] Optionally, the output of the previous level filter lookup table in a multi-level filter lookup table can be the input of the next level filter lookup table.
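The chaining of levels can be sketched as below. This is a simplification under stated assumptions: each level is reduced to a 256-entry table taking a single index, and the table contents produced by the hypothetical helper `make_level` are illustrative values rather than outputs of a trained network.

```python
def make_level(offset: int):
    """Hypothetical level: a 256-entry table with illustrative contents."""
    return [min(255, max(0, v + offset)) for v in range(256)]

# Nine levels, mirroring filtering lookup table 1 to filtering lookup table 9.
levels = [make_level(o) for o in (3, -1, 2, -2, 1, -1, 1, -1, 0)]

def cascade_lookup(index: int, tables) -> int:
    """Feed the output of each level into the next level as its index."""
    value = index
    for table in tables:
        value = table[value]
    return value
```

The cost per pixel is one memory read per level, regardless of how expensive the corresponding convolutional layer would have been to evaluate.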
[0144] Optionally, each level of the multi-level filtering lookup table can include a correspondence between pixels, and each correspondence can be assigned an index. For example, filtering lookup table 1 stores the correspondence between the pixels to be filtered and the filtered pixels or intermediate values V(2), and sets a corresponding index for each correspondence. For example, filtering lookup table 9 stores the correspondence between the pixels to be filtered and the filtered pixels or intermediate values V, and sets a corresponding index for each correspondence.
[0145] Optionally, when constructing a multi-level filter lookup table, a neural network can be used for processing. The inputs and outputs of the convolutional layers within the convolutional modules of the neural network are stored in the form of lookup tables to obtain the filter lookup table. The filter lookup tables are then aggregated according to the order of each convolutional layer to obtain a multi-level filter lookup table.
[0146] Optionally, each convolutional layer can perform convolution processing according to a convolution template. For example, one convolutional layer can perform convolution processing according to convolution template 1, and another convolutional layer can perform convolution processing according to convolution template 9, until the output is the image block V after neural network filtering.
[0147] Optionally, since different convolutional layers in a neural network correspond to different filter lookup tables, in this embodiment, at least one filter lookup table to be applied can be determined from multiple filter lookup tables based on the reference parameters of at least one image block. The reference parameters of at least one image block are input into the determined filter lookup table for lookup to determine at least one corresponding filtered pixel, which is output to obtain a target image block including the filtered pixels. In this case, the target image block is an image block that has been filtered using the filter lookup table.
[0148] In this embodiment, the target image block is determined or generated based on reference parameters of at least one image block and a filter lookup table. This reduces the complexity of the filter when filtering at least one image block using the filter lookup table; and / or, since the lookup table can be constructed based on a neural network, the filtering effect can be guaranteed, while avoiding the direct use of a neural network for filtering, thereby improving the efficiency of video encoding and / or decoding.
[0149] Second Embodiment
[0150] Based on the first embodiment, a second embodiment is proposed.
[0151] In this embodiment, the reference parameters include at least one of the following Methods 1 to 13:
[0152] Method 1: Quantization parameters;
[0153] Optionally, the quantization parameters may include the quantization step size. The quantization step size can be a parameter calculated based on the quantization parameter QP that controls the quantization precision. Optionally, the quantization step size can determine the degree of precision loss in the process of dividing the original signal (such as pixel values or transform coefficients) from continuous values into multiple value intervals. The quantization process maps continuous values to a smaller number of discrete values, achieving data compression by discarding some details. The larger the quantization step size, the greater the difference between each quantization level, and therefore the more details are discarded. This leads to increased compression efficiency but also introduces greater distortion, i.e., a decrease in image quality. Conversely, the smaller the quantization step size, the more details are retained, resulting in higher image quality, but the required storage space and transmission bandwidth also increase accordingly.
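The trade-off between quantization step size and distortion can be made concrete with a small sketch. The relation used here, in which the step size doubles every 6 QP, is the one commonly associated with H.264/HEVC-style codecs; this application does not state a formula, so it is an assumption for illustration only, and the function names are hypothetical.

```python
def qstep_from_qp(qp: int) -> float:
    """Assumed HEVC-style relation: step size doubles every 6 QP."""
    return 2.0 ** ((qp - 4) / 6.0)

def quantize(value: float, step: float) -> int:
    """Map a value to a discrete quantization level."""
    return round(value / step)

def dequantize(level: int, step: float) -> float:
    """Reconstruct an approximate value from its quantization level."""
    return level * step

def reconstruction_error(value: float, qp: int) -> float:
    step = qstep_from_qp(qp)
    return abs(value - dequantize(quantize(value, step), step))
```

For a coefficient of 37.0, QP 4 (step 1.0) reconstructs it exactly, while QP 22 (step 8.0) reconstructs 40.0, an error of 3.0: a larger step discards more detail.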
[0154] Optionally, the filter lookup table to be applied can be selected from multiple filter lookup tables based on the quantization parameters of at least one image block, and the pixels to be filtered in at least one image block can be input into the filter lookup table to be applied for filtering, and the filtered target image block can be output.
[0155] Optionally, the pixels to be filtered in at least one image block can be converted into indices, and these indices can be entered into a filter lookup table for searching to determine the corresponding filtered pixels for output. The target image block can then be constructed and determined based on the output filtered pixels. Optionally, the indices can be determined based on the quantization parameters and the pixels to be filtered in at least one image block. For example, the index corresponding to the pixel to be filtered in at least one image block can be determined from a table containing the mapping relationship between the pixels to be filtered and the indices. The quantization parameters can then be used to determine whether the index needs to be updated (e.g., if the quantization parameter is greater than a preset quantization parameter threshold, it is determined that the index needs to be updated). If an update is required, the index can be updated (e.g., by increasing or decreasing it by a preset index value), and then entered into the filter lookup table for searching and output to obtain the target image block. Optionally, the target image block includes filtered pixels.
[0156] In this embodiment, when the reference parameters include quantization parameters, the target image block is determined or generated based on the quantization parameters and the filter lookup table, thereby avoiding the direct use of neural networks for filtering and improving the efficiency of video encoding and / or decoding.
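A minimal sketch of Method 1, combining table selection and index updating by quantization parameter, is given below. The threshold, the index increment, the identity pixel-to-index mapping, and the table contents are all hypothetical values chosen for illustration; this application leaves them open.

```python
QP_THRESHOLD = 32   # hypothetical preset quantization parameter threshold
INDEX_DELTA = 1     # hypothetical preset index value

def select_lut(qp: int, luts_by_max_qp):
    """Pick the first table whose QP upper bound covers the given QP."""
    for max_qp, table in luts_by_max_qp:
        if qp <= max_qp:
            return table
    return luts_by_max_qp[-1][1]

def index_for_pixel(pixel: int, qp: int, table_len: int) -> int:
    index = pixel  # pixel-to-index mapping assumed to be the identity
    if qp > QP_THRESHOLD:  # index update condition
        index = min(table_len - 1, index + INDEX_DELTA)
    return index

def filter_block(pixels, qp: int, luts_by_max_qp):
    table = select_lut(qp, luts_by_max_qp)
    return [table[index_for_pixel(p, qp, len(table))] for p in pixels]

# Illustrative tables only: identity for low QP, even rounding for high QP.
low_qp_lut = list(range(256))
high_qp_lut = [v // 2 * 2 for v in range(256)]
luts = [(32, low_qp_lut), (63, high_qp_lut)]
```

With these tables, `filter_block([10, 20], 22, luts)` leaves the pixels unchanged, while a QP of 40 selects the second table and shifts each index by one before lookup.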
[0157] Method 2: Boundary strength;
[0158] Optionally, the boundary strength can be a measure of the sharpness or intensity of the edges between adjacent image blocks. Optionally, based on the boundary strength of the current image block to be filtered, it can be determined whether the boundary region pixels belong to a true boundary or to a pseudo-boundary, such as a compression-induced false contour.
[0159] Optionally, the filter lookup table to be applied can be selected from multiple filter lookup tables based on the boundary strength of at least one image block, and the pixels to be filtered in at least one image block can be input into the filter lookup table to be applied for filtering, and the filtered target image block can be output.
[0160] The pixels to be filtered in at least one image block can be converted into indices, and these indices can be entered into a filter lookup table for searching to determine the corresponding filtered pixels for output. The target image block can then be constructed and determined based on the output filtered pixels. Optionally, the indices can be determined based on the boundary strength and the pixels to be filtered in at least one image block. For example, the index corresponding to the pixel to be filtered in at least one image block can be determined from a table containing the mapping relationship between the pixels to be filtered and the indices. The boundary strength can then be used to determine whether the index needs to be updated (e.g., if the boundary strength is greater than a preset boundary strength threshold, it is determined that the index needs to be updated). If an update is needed, the index can be updated (e.g., by increasing or decreasing it by a preset index value), and then entered into the filter lookup table for searching and output to obtain the target image block.
[0161] In this embodiment, by determining or generating target image blocks based on boundary strength and a filter lookup table when the reference parameters include boundary strength, the direct use of neural networks for filtering can be avoided, thereby improving the efficiency of video encoding and / or decoding.
[0162] Method 3: Representation parameters;
[0163] Optionally, the representation parameters can be the various parameters corresponding to representation values, such as the representation interval, which can be the interval between representation values. Optionally, a representation value can be a numerical value used to describe or represent the characteristics, attributes, etc., of an object or phenomenon, and has often undergone certain processing or selection to reflect key characteristics, such as using the average score to represent the learning level of a class. Optionally, the representation value can be a sampled value. When the representation value is a sampled value, the representation interval can be the sampling interval. Optionally, the sampled value can be a pixel value.
[0164] Optionally, the following description takes the representation interval as an example of the representation parameter.
[0165] Optionally, the filter lookup table to be applied can be selected from multiple filter lookup tables based on the representation interval of at least one image block, and the pixels to be filtered in at least one image block can be input into the filter lookup table to be applied for filtering, and the filtered target image block can be output.
[0166] Optionally, when the representation parameter is the sampling interval, the filter lookup table to be applied can be selected from multiple filter lookup tables based on the sampling interval of at least one image block, and the pixels to be filtered in at least one image block are input into the filter lookup table to be applied for filtering, and the filtered target image block is output.
[0167] Optionally, the pixels to be filtered in at least one image block can be converted into indices, and these indices can be entered into a filter lookup table for searching to determine the corresponding filtered pixels for output. The target image block can then be constructed and determined based on the output filtered pixels. Optionally, the indices can be determined based on the representation interval and the pixels to be filtered in at least one image block. For example, the index corresponding to the pixels to be filtered in at least one image block can be determined from a table containing the mapping relationship between the pixels to be filtered and the indices. Whether the index needs to be updated can be determined based on the representation interval (e.g., if the representation interval is greater than a preset representation interval threshold, it is determined that the index needs to be updated). If an update is needed, the index can be updated (e.g., by increasing or decreasing it by a preset index value), and then entered into the filter lookup table for searching and output to obtain the target image block.
[0168] In this embodiment, by determining or generating target image blocks based on representation parameters and a filter lookup table, the direct use of neural networks for filtering can be avoided, thereby improving the efficiency of video encoding and / or decoding.
[0169] Method 4: Position information of the pixels to be filtered;
[0170] Optionally, the pixel to be filtered can be a pixel in at least one image block that is to be filtered.
[0171] Optionally, the position information of the pixel to be filtered may include the pixel position of the pixel to be filtered within the image block to be filtered. It may also include the positional relationship between the pixel to be filtered and a preset pixel within the image block. Optionally, the preset pixel may be the middle pixel of the image block, the top-left pixel of the image block, or a pixel within the image block determined by the user, etc. Optionally, the positional relationship may include the distance between the pixel to be filtered and the preset pixel, such as a 5-pixel interval, a 3-pixel interval, etc. It may also include the directional relationship between the pixel to be filtered and the preset pixel, such as the pixel to be filtered being located to the left of the preset pixel, or to the right of the preset pixel, etc.
[0172] Optionally, the filter lookup table to be applied can be selected from multiple filter lookup tables based on the position information of the pixels to be filtered in at least one image block, and the pixels to be filtered in at least one image block can be input into the filter lookup table to be applied for filtering, and the filtered target image block can be output.
[0173] The pixels to be filtered in at least one image patch can be converted into indices, and these indices can be entered into a filter lookup table for searching to determine the corresponding filtered pixels for output. The target image patch can then be constructed and determined based on the output filtered pixels. Optionally, the index can be determined based on the position information of the pixels to be filtered and the pixels to be filtered in at least one image patch. For example, the index corresponding to the pixel to be filtered in at least one image patch can be determined from a table containing the mapping relationship between the pixels to be filtered and the indices, and the position information of the pixel to be filtered can be used to determine whether the index needs to be updated. For example, if the position information of the pixel to be filtered meets a preset position condition (e.g., the number of pixels between the pixel to be filtered and a preset pixel is greater than a preset pixel number threshold, and / or the pixel to be filtered is located to the left and / or to the right of a preset pixel, etc.), it is determined that the index needs to be updated. If an update is required, the index can be updated (e.g., by increasing or decreasing a preset index value), and then entered into the filter lookup table for searching and output to obtain the target image patch.
[0174] In this embodiment, by determining or generating the target image block based on the position information of the pixel to be filtered and the filter lookup table when the reference parameters include the position information of the pixel to be filtered, the direct use of neural networks for filtering can be avoided, thereby improving the efficiency of video encoding and / or decoding.
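The position-based index update of Method 4 can be sketched as follows. The choice of Chebyshev distance as the "number of pixels between" two positions, the threshold of 3, the increment of 1, and the identity pixel-to-index mapping are all assumptions made for illustration.

```python
DISTANCE_THRESHOLD = 3  # hypothetical preset pixel number threshold
INDEX_DELTA = 1         # hypothetical preset index value

def position_relation(pos, preset):
    """Return (distance, direction) of pos relative to preset, as (row, col)."""
    d_row, d_col = pos[0] - preset[0], pos[1] - preset[1]
    distance = max(abs(d_row), abs(d_col))  # Chebyshev distance (assumed)
    if d_col < 0:
        direction = "left"
    elif d_col > 0:
        direction = "right"
    else:
        direction = "same_column"
    return distance, direction

def index_with_position(pixel: int, pos, preset, table_len: int = 256) -> int:
    index = pixel  # pixel-to-index mapping assumed to be the identity
    distance, direction = position_relation(pos, preset)
    # Preset position condition: far from the preset pixel, or to its left.
    if distance > DISTANCE_THRESHOLD or direction == "left":
        index = min(table_len - 1, index + INDEX_DELTA)
    return index
```

For a preset pixel at (2, 5), a pixel at (2, 0) lies 5 pixels to the left and has its index updated, while a pixel at (2, 6) is adjacent on the right and keeps its original index.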
[0175] Method 5: Pixel value to be filtered;
[0176] Optionally, at least one pixel in an image block to be filtered can be identified, and its pixel value can be input into a filter lookup table for searching. The filter pixel corresponding to the pixel to be filtered in the lookup table is then determined and output. Based on the output filter pixel, a target image block is determined or generated. That is, the target image block includes at least one filter pixel corresponding to the pixel to be filtered.
[0177] Optionally, the pixel to be filtered can be a predicted pixel (e.g., pixel prediction), a reconstructed pixel (e.g., pixel reconstruction), or a reference pixel. Optionally, the predicted pixel, reconstructed pixel, and / or reference pixel can be intra-frame pixels and / or inter-frame pixels. The reference pixel can be an intermediate value used or generated when the predicted pixel and / or reconstructed pixel are used; no limitation is imposed here.
[0178] Optionally, at least one pixel value to be filtered can be converted into at least one index, and the at least one index can be input into at least one filter lookup table for lookup to determine the corresponding filter pixel in the filter lookup table, and the filter pixel can be output. The target image block can be determined or generated based on the output filter pixel.
[0179] In this embodiment, when the reference parameters include the pixel value to be filtered, the target image block is determined or generated based on the pixel value to be filtered and the filter lookup table, thereby avoiding the direct use of neural networks for filtering and improving the efficiency of video encoding and / or decoding.
[0180] Method 6: Image block size information;
[0181] Optionally, the size information of the image patch may include the width, height, area, and perimeter of the image patch.
[0182] Optionally, at least one filter lookup table to be applied can be selected from multiple filter lookup tables based on the size information of at least one image block, and the pixels to be filtered in at least one image block can be input into the at least one filter lookup table to be applied for filtering, and the filtered target image block can be output.
[0183] Optionally, the pixels to be filtered in at least one image block can be converted into indices, and these indices can be input into a filter lookup table for lookup to determine the corresponding filtered pixels for output. The target image block can then be constructed and determined based on the output filtered pixels. Alternatively, the indices can be determined based on the size information of at least one image block and the pixels to be filtered within that block. For example, the index corresponding to a pixel to be filtered in the at least one image block can be determined from a table containing the mapping relationship between pixels to be filtered and indices. The size information of the at least one image block can then be used to determine whether the index needs to be updated. For instance, if the size information meets a preset size condition (e.g., the size information is greater than a preset size threshold), it is determined that the index needs to be updated. In that case, the index can be updated (e.g., increased or decreased by a preset index value) and then input into the filter lookup table for lookup, and the resulting filtered pixels are output to obtain the target image block.
[0184] Optionally, a filter lookup table can be selected from multiple filter lookup tables based on the image block attributes and / or image block type of the at least one image block. The at least one image block is then input into the selected filter lookup table for filtering, and the target image block is output. Optionally, the image block attributes may include the image texture of the image block. The image block type includes natural images or screen content images.
[0185] In this embodiment, by determining or generating the target image block based on the image block size information and the filter lookup table when the reference parameters include the image block size information, the direct use of neural networks for filtering can be avoided, thereby improving the efficiency of video encoding and / or decoding.
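The size-dependent index update can be sketched as follows. This is an illustration only: the area threshold, the index offset, and the two-bank table layout (one bank for small blocks, one for large blocks) are assumptions made for the example and are not taken from the source.

```python
from typing import List

INDEX_SHIFT = 4        # assumed: 16 pixel values share one base index
ENTRIES_PER_BANK = 16  # assumed: 8-bit samples give 16 base indices
AREA_THRESHOLD = 256   # assumed preset size threshold, in samples


def pixel_index(pixel: int, width: int, height: int) -> int:
    """Base index from the pixel value, updated when the size condition is met."""
    index = pixel >> INDEX_SHIFT
    if width * height > AREA_THRESHOLD:  # preset size condition
        index += ENTRIES_PER_BANK        # increase index by a preset value
    return index


def build_two_bank_lut() -> List[int]:
    """Hypothetical table: bank 0 stores bin midpoints, bank 1 stores bin floors."""
    small = [(i << INDEX_SHIFT) + 8 for i in range(ENTRIES_PER_BANK)]
    large = [i << INDEX_SHIFT for i in range(ENTRIES_PER_BANK)]
    return small + large


def filter_pixel(pixel: int, width: int, height: int, lut: List[int]) -> int:
    """Look up the filtered pixel using the (possibly updated) index."""
    return lut[pixel_index(pixel, width, height)]
```

Updating the index rather than switching tables keeps a single table in memory; the same pixel value simply lands in a different region of that table for large blocks.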
[0186] Method 7: Filtering information from neighboring blocks;
[0187] Optionally, a neighboring block can be an image block adjacent to the at least one image block, and can be an image block that has already been filtered.
[0188] Optionally, the filtering information of neighboring blocks may include the filtered pixels of the neighboring blocks, the filtering lookup table used by the neighboring blocks when performing filtering processing, etc.
[0189] Optionally, at least one filter lookup table to be applied can be selected from multiple filter lookup tables based on the filtering information of the neighboring blocks of at least one image block, and the pixels to be filtered in at least one image block can be input into the at least one filter lookup table to be applied for filtering, and the filtered target image block can be output.
[0190] Optionally, the pixels to be filtered in at least one image block can be converted into indices, and these indices can be entered into a filter lookup table for searching to determine the corresponding filtered pixels for output. The target image block is then constructed and determined based on the output filtered pixels. Alternatively, the indices can be determined based on the filtering information of neighboring blocks and the pixels to be filtered in at least one image block. For example, the index corresponding to the pixel to be filtered in at least one image block can be determined in a table containing the mapping relationship between the pixels to be filtered and indices. Simultaneously, the selected filter lookup table can be determined from multiple filter lookup tables based on the filtering information of neighboring blocks. For example, the filter lookup table used by neighboring blocks can be used as the selected filter lookup table. The index is entered into the selected filter lookup table for searching, and the corresponding filtered pixels are output. The target image block is then determined or generated based on the filtered pixels. For example, the target image block includes the output filtered pixels.
[0191] Optionally, a filter lookup table can be selected from multiple filter lookup tables based on the image block attributes and / or image block type of at least one neighboring block, and then at least one image block can be input into the selected filter lookup table for filtering processing, and the target image block can be output.
[0192] In this embodiment, by determining or generating the target image block based on the filtering information of the neighboring blocks and the filtering lookup table when the reference parameters include the filtering information of the neighboring blocks, the direct use of neural networks for filtering can be avoided, thereby improving the efficiency of video encoding and / or decoding.
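The reuse of a neighboring block's lookup table can be sketched as below. The `BlockFilterInfo` record, the rule of taking the first available neighbor, and the fallback table id are hypothetical choices for illustration; the source only states that the table used by a neighboring block can serve as the selected table.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class BlockFilterInfo:
    """Hypothetical record of how an already-filtered block was processed."""
    lut_id: int


def select_lut_id(neighbors: Sequence[Optional[BlockFilterInfo]],
                  default_lut_id: int = 0) -> int:
    """Reuse the table of the first available neighbor (assumed rule)."""
    for info in neighbors:
        if info is not None:
            return info.lut_id
    return default_lut_id


def filter_block_with_neighbor_lut(block: List[List[int]],
                                   neighbors: Sequence[Optional[BlockFilterInfo]],
                                   luts: Dict[int, List[int]],
                                   shift: int = 4) -> List[List[int]]:
    """Convert pixels to indices and look them up in the selected table."""
    lut = luts[select_lut_id(neighbors)]
    return [[lut[p >> shift] for p in row] for row in block]
```

The same selection pattern applies when the filtering information comes from another kind of already-filtered block; only the source of the `BlockFilterInfo` record changes.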
[0193] Method 8: Filtering information for non-neighboring blocks;
[0194] Optionally, a non-neighboring block can be an image block that is not adjacent to the at least one image block, and can be an image block that has already been filtered.
[0195] Optionally, the filtering information for non-neighboring blocks may include the filtered pixels of the non-neighboring blocks, the filtering lookup table used by the non-neighboring blocks during filtering, etc.
[0196] Optionally, at least one filter lookup table to be applied can be selected from multiple filter lookup tables based on the filtering information of the non-neighboring blocks of at least one image block, and the pixels to be filtered in at least one image block can be input into the at least one filter lookup table to be applied for filtering, and the filtered target image block can be output.
[0197] Optionally, the pixels to be filtered in at least one image block can be converted into indices, and these indices can be entered into a filter lookup table for searching to determine the corresponding filtered pixels for output. The target image block is then constructed and determined based on the output filtered pixels. Alternatively, the indices can be determined based on the filtering information of non-neighboring blocks and the pixels to be filtered in at least one image block. For example, the index corresponding to the pixel to be filtered in at least one image block can be determined from a table containing the mapping relationship between the pixels to be filtered and the indices. Simultaneously, the selected filter lookup table can be determined from multiple filter lookup tables based on the filtering information of non-neighboring blocks. For example, the filter lookup table used by the non-neighboring blocks can be used as the selected filter lookup table. The index is entered into the selected filter lookup table for searching, and the corresponding filtered pixels are output. The target image block including the filtered pixels is then determined or generated based on the filtered pixels.
[0198] Optionally, a filter lookup table can be selected from multiple filter lookup tables based on the image block attributes and / or image block type of at least one non-neighboring block, and then at least one image block can be input into the selected filter lookup table for filtering processing, and the target image block can be output.
[0199] In this embodiment, when the reference parameters include filtering information of non-neighboring blocks, the target image block is determined or generated based on the filtering information of non-neighboring blocks and the filter lookup table, thereby avoiding the direct use of neural networks for filtering processing and improving the efficiency of video encoding and / or decoding.
[0200] Method 9: Filtering information of cross-component blocks;
[0201] Optionally, the cross-component block can be an image block that is in a different component from the current at least one image block. For example, if the current at least one image block is an image block of the Y component, then the cross-component block can be an image block of the U component and / or V component. If the current at least one image block is an image block of the U component, then the cross-component block can be an image block of the Y component and / or V component. If the current at least one image block is an image block of the V component, then the cross-component block can be an image block of the U component and / or Y component.
[0202] Optionally, the filtering information of the cross-component block may include parameter information that needs to be applied when filtering the cross-component block.
[0203] Optionally, at least one filter lookup table to be applied can be selected from multiple filter lookup tables based on the filtering information of the cross-component block of at least one image block, and the pixels to be filtered in at least one image block are input into the at least one filter lookup table to be applied for filtering, and the filtered target image block is output.
[0204] Optionally, the pixels to be filtered in at least one image block can be converted into indices, and these indices can be input into a filter lookup table for searching to determine the corresponding filtered pixels for output. The target image block is then constructed and determined based on the output filtered pixels. Alternatively, the indices can be determined based on the filtering information of the cross-component block and the pixels to be filtered in at least one image block. For example, the index corresponding to the pixel to be filtered in at least one image block can be determined in a table containing the mapping relationship between the pixels to be filtered and the indices. Simultaneously, the selected filter lookup table is determined from multiple filter lookup tables based on the filtering information of the cross-component block. For instance, when the filtering information of the cross-component block meets one condition, one type of filter lookup table is selected; when it meets another condition, another type of filter lookup table is selected. The two types of filter lookup tables can be different. The index is then input into the selected filter lookup table for searching, and the corresponding filtered pixels are output. The target image block including the filtered pixels is then determined or generated based on these filtered pixels.
[0205] Optionally, a filter lookup table can be selected from multiple filter lookup tables based on the image block attributes and / or image block type of at least one cross-component block, and then at least one image block can be input into the selected filter lookup table for filtering processing, and the target image block can be output.
[0206] In this embodiment, by determining or generating target image blocks based on cross-component block filtering information and a filter lookup table when the reference parameters include cross-component block filtering information, the direct use of neural networks for filtering can be avoided, thereby improving the efficiency of video encoding and / or decoding.
[0207] Method 10: Filtering information of co-located blocks;
[0208] Optionally, the co-located block can be an image block in the co-located image that has the same position and size as the current block. Optionally, the co-located image can be the reference image that is closest in time to the current image.
[0209] Optionally, the filtering information of the co-located block may include the filter lookup table used by the co-located block during filtering, and may also include the size information, position information, etc. of the co-located block.
[0210] Optionally, at least one filter lookup table to be applied can be selected from multiple filter lookup tables based on the filtering information of the co-located block of at least one image block, and the pixels to be filtered in at least one image block are input into the at least one filter lookup table to be applied for filtering, and the filtered target image block is output.
[0211] Optionally, the pixels to be filtered in at least one image block can be converted into indices, and these indices can be input into a filter lookup table for searching to determine the corresponding filtered pixels for output. The target image block is then constructed and determined based on the output filtered pixels. Alternatively, the indices can be determined based on the filtering information of the co-located block and the pixels to be filtered in at least one image block. For example, the index corresponding to the pixel to be filtered in at least one image block can be determined from a table containing the mapping relationship between the pixels to be filtered and the indices. Simultaneously, the selected filter lookup table can be determined from multiple filter lookup tables based on the filtering information of the co-located block. For example, the filter lookup table used by the co-located block can be used as the selected filter lookup table. The index is input into the selected filter lookup table for searching, and the corresponding filtered pixels are output. The target image block including the filtered pixels is then determined or generated based on the filtered pixels.
[0212] Optionally, a filter lookup table can be selected from multiple filter lookup tables based on the image block attributes and / or image block type of at least one co-located block, and then at least one image block can be input into the selected filter lookup table for filtering processing, and the target image block can be output.
[0213] In this embodiment, by determining or generating the target image block based on the filtering information of the co-located block and the filter lookup table when the reference parameters include the filtering information of the co-located block, the direct use of neural networks for filtering can be avoided, thereby improving the efficiency of video encoding and / or decoding.
[0214] Method 11: Filtering information of temporal blocks;
[0215] Optionally, the temporal block can be a block distinguished in the time domain, such as an image block in the previous frame. For example, suppose the video data contains three frames, with the first, second, and third frames played in the first, second, and third seconds, respectively. If the image block being predicted at the current moment is obtained by partitioning the second frame, then the temporal block can be the corresponding image block in a frame other than the second frame.
[0216] Optionally, the filtering information of the temporal block may include the filter lookup table used when the temporal block is filtered, and may also include the size information, position information, etc. of the temporal block.
[0217] Optionally, at least one filter lookup table to be applied can be selected from multiple filter lookup tables based on the filtering information of the temporal block of at least one image block, and the pixels to be filtered in at least one image block can be input into the at least one filter lookup table to be applied for filtering, and the filtered target image block can be output.
[0218] Optionally, the pixels to be filtered in at least one image block can be converted into indices, and these indices can be entered into a filter lookup table for searching to determine the corresponding filtered pixels for output. The target image block is then constructed and determined based on the output filtered pixels. Alternatively, the indices can be determined based on the filtering information of the temporal block and the pixels to be filtered in at least one image block. For example, the index corresponding to the pixel to be filtered in at least one image block can be determined from a table containing the mapping relationship between the pixels to be filtered and the indices. Simultaneously, the selected filter lookup table can be determined from multiple filter lookup tables based on the filtering information of the temporal block. For example, the filter lookup table used in the temporal block can be used as the selected filter lookup table. The index is entered into the selected filter lookup table for searching, and the corresponding filtered pixels are output. The target image block including the filtered pixels is then determined or generated based on the filtered pixels.
[0219] Optionally, a filter lookup table can be selected from multiple filter lookup tables based on the image block attributes and / or image block type of at least one temporal block, and then the at least one image block can be input into the selected filter lookup table for filtering processing, and the target image block can be output.
[0220] In this embodiment, by determining or generating the target image block based on the filtering information of the temporal block and the filter lookup table when the reference parameters include the filtering information of the temporal block, the direct use of neural networks for filtering can be avoided, thereby improving the efficiency of video encoding and / or decoding.
[0221] Method 12: Filtering information of default blocks;
[0222] Optionally, the default block can be a pre-set image block, such as an image block with typical pixel characteristics pre-set by the encoder and / or decoder.
[0223] Optionally, the filtering information of the default block may include the filtering lookup table used by the default block when performing filtering processing, and may also include the size information, position information, etc. of the default block.
[0224] Optionally, at least one filter lookup table to be applied can be selected from multiple filter lookup tables based on the filter information of the default block, and the pixels to be filtered in at least one image block can be input into the at least one filter lookup table to be applied for filtering, and the filtered target image block can be output.
[0225] Optionally, the pixels to be filtered in at least one image block can be converted into indices, and these indices can be entered into a filter lookup table for searching to determine the corresponding filtered pixels for output. The target image block is then constructed and determined based on the output filtered pixels. Alternatively, the indices can be determined based on the filtering information of a default block and the pixels to be filtered in at least one image block. For example, the index corresponding to the pixel to be filtered in at least one image block can be determined from a table containing the mapping relationship between the pixels to be filtered and the indices. Simultaneously, the selected filter lookup table can be determined from multiple filter lookup tables based on the filtering information of the default block. For example, the filter lookup table used by the default block can be used as the selected filter lookup table. The index is entered into the selected filter lookup table for searching, and the corresponding filtered pixels are output. The target image block including the filtered pixels is then determined or generated based on the filtered pixels.
[0226] Optionally, a filter lookup table can be selected from multiple filter lookup tables based on the image block attributes and / or image block type of at least one default block, and then at least one image block can be input into the selected filter lookup table for filtering processing, and the target image block can be output.
[0227] In this embodiment, by determining or generating the target image block based on the filtering information of the default block and the filter lookup table when the reference parameters include the filtering information of the default block, the direct use of neural networks for filtering can be avoided, thereby improving the efficiency of video encoding and / or decoding.
[0228] Method 13: Filtering information for candidate blocks.
[0229] Optionally, the candidate block can be an image block to be selected.
[0230] Optionally, in Method 13, the candidate block is determined by a motion vector and / or a block vector. Optionally, at least one candidate block can be determined or obtained based on the motion vector and / or block vector corresponding to at least one pixel to be filtered in at least one image block.
[0231] Optionally, the filtering information of the candidate block may include the filtering lookup table used when the candidate block is filtered, and may also include the size information, position information, etc. of the candidate block.
[0232] Optionally, at least one filter lookup table to be applied can be selected from multiple filter lookup tables based on the filtering information of the candidate block, and the pixels to be filtered in at least one image block can be input into the at least one filter lookup table to be applied for filtering, and the filtered target image block can be output.
[0233] Optionally, the pixels to be filtered in at least one image block can be converted into indices, and these indices can be entered into a filter lookup table for searching to determine the corresponding filtered pixels for output. The target image block is then constructed and determined based on the output filtered pixels. Alternatively, the indices can be determined based on the filtering information of the candidate blocks and the pixels to be filtered in at least one image block. For example, the index corresponding to the pixel to be filtered in at least one image block can be determined from a table containing the mapping relationship between the pixels to be filtered and the indices. Simultaneously, the selected filter lookup table can be determined from multiple filter lookup tables based on the filtering information of the candidate blocks. For example, the filter lookup table used by the candidate blocks can be used as the selected filter lookup table. The index is entered into the selected filter lookup table for searching, and the corresponding filtered pixels are output. The target image block including the filtered pixels is then determined or generated based on the filtered pixels.
[0234] Optionally, a filter lookup table can be selected from multiple filter lookup tables based on the image block attributes and / or image block type of at least one candidate block, and then the at least one image block can be input into the selected filter lookup table for filtering processing, and the target image block can be output.
[0235] In this embodiment, by determining or generating the target image block based on the filtering information of the candidate block and the filter lookup table when the reference parameters include the filtering information of the candidate block, the direct use of neural networks for filtering can be avoided, thereby improving the efficiency of video encoding and / or decoding.
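One way to picture how a candidate block might be located from a motion vector or block vector is sketched below. Integer-sample vectors, a fixed square block grid, clamping to the picture, and the `lut_id_by_block` map are all assumptions introduced for illustration; the source does not describe the derivation at this level of detail.

```python
from typing import Dict, Tuple

Origin = Tuple[int, int]  # top-left sample position of a block


def candidate_block_origin(x: int, y: int, mv_x: int, mv_y: int,
                           block_size: int, pic_w: int, pic_h: int) -> Origin:
    """Displace a pixel position by its vector and snap to the block grid."""
    cx = min(max(x + mv_x, 0), pic_w - 1)  # clamp to picture width
    cy = min(max(y + mv_y, 0), pic_h - 1)  # clamp to picture height
    return (cx // block_size) * block_size, (cy // block_size) * block_size


def candidate_lut_id(origin: Origin, lut_id_by_block: Dict[Origin, int],
                     default_lut_id: int = 0) -> int:
    """Reuse the table id recorded for the candidate block, if any."""
    return lut_id_by_block.get(origin, default_lut_id)
```

The returned table id would then pick the filter lookup table into which the pixel indices are input.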
[0236] Third Embodiment
[0237] Based on the first or second embodiment, a third embodiment is proposed.
[0238] In this embodiment, step S10 includes at least one of the following Methods 14 to 18.
[0239] Method 14: Determine or generate at least one first intermediate value based on a neural network and at least one reference parameter, and determine or generate a target image block based on at least one first intermediate value and at least one filter lookup table;
[0240] Optionally, the first intermediate value can be data generated during an intermediate processing step in the filtering process. It can be a filtered pixel generated after at least one filtering process, or it can be other values. The intermediate processing step can be a process of performing pre-filtering processing on the pixel to be filtered (such as filtering through a neural network or a filter lookup table).
[0241] Optionally, at least one reference parameter can be input into the neural network for filtering, and the output can be a first intermediate value. For example, at least one of the following can be input into the neural network for filtering: quantization parameter, boundary strength, representation interval, position information of the pixel to be filtered, pixel value to be filtered, size information of the image block, filtering information of neighboring blocks, filtering information of non-neighboring blocks, filtering information of cross-component blocks, filtering information of co-located blocks, filtering information of temporal blocks, filtering information of default blocks, and filtering information of candidate blocks, and the output can be a first intermediate value.
[0242] Optionally, the first intermediate value can be directly used as the filtered pixel, and the target image block can be determined or generated based on the filtered pixel.
[0243] Optionally, the first intermediate value can be processed to determine or generate a target image block. For example, the first intermediate value can be input into at least one filter lookup table to determine the filtered pixel, and the target image block can be determined or generated based on the at least one filtered pixel. Optionally, the filter lookup table may include a correspondence between the first intermediate value and the filtered pixel.
[0244] Optionally, at least one filter lookup table can be a multi-level filter lookup table. The first intermediate value can be converted into an index and input into the multi-level filter lookup table for parallel or serial lookup, the output of which is the filtered pixel, and the target image block is determined or generated based on at least one filtered pixel.
[0245] In this embodiment, by determining or generating at least one first intermediate value based on a neural network and at least one reference parameter, and by determining or generating a target image block based on the at least one first intermediate value and at least one filter lookup table, the filtering effect can be improved by using the neural network and the filter lookup table together for filtering processing, thereby improving the effect of video encoding and / or decoding.
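The order of operations in Method 14 (network first, lookup table second) can be sketched as follows. The function `toy_pre_network` is a deliberately trivial stand-in for a trained neural network, and the blend weights, shift, and table contents are assumptions made for the example.

```python
from typing import List


def toy_pre_network(pixel: int, qp: int) -> int:
    """Stand-in for a trained network: a fixed integer blend of pixel value and QP."""
    return max(0, (9 * pixel + qp) // 10)


def method14_filter_pixel(pixel: int, qp: int, lut: List[int], shift: int = 4) -> int:
    """Network output (first intermediate value) -> index -> filtered pixel."""
    first_intermediate = toy_pre_network(pixel, qp)
    index = min(first_intermediate >> shift, len(lut) - 1)  # keep index in range
    return lut[index]
```

In a real codec the stand-in would be replaced by actual network inference; the point of the sketch is only that the table consumes the network output rather than the raw pixel.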
[0246] Method 15: Search in at least one filter lookup table based on at least one reference parameter to obtain at least one second intermediate value, and determine or generate the target image block based on the neural network and at least one second intermediate value;
[0247] Optionally, the second intermediate value can be data generated during an intermediate processing step in the filtering process, such as filtered pixels generated after at least one filtering process, or other values. Optionally, the second intermediate value can be the same as or different from the first intermediate value.
[0248] Optionally, at least one reference parameter can be used to determine the selected filter lookup table among multiple filter lookup tables, and the at least one reference parameter can be input into the at least one filter lookup table to find and obtain at least one second intermediate value. Optionally, the filter lookup table includes a correspondence between the at least one reference parameter and the second intermediate value, and an index can be set for each correspondence in the filter lookup table.
[0249] Optionally, at least one filtering lookup table can be a multi-level filtering lookup table. The pixel values to be filtered in at least one image block can be converted into an index and input into the multi-level filtering lookup table for parallel or serial lookup, outputting at least one second intermediate value. Optionally, the index of the pixel values to be filtered in at least one image block can be updated (e.g., the index value can be increased or decreased) based on at least one of the following: quantization parameters, boundary strength, characterization parameters, the position information of the pixel to be filtered, the size information of the image block, filtering information of neighboring blocks, filtering information of non-neighboring blocks, filtering information across component blocks, filtering information of co-located blocks, filtering information of temporal blocks, filtering information of default blocks, and filtering information of candidate blocks. The updated index is then input into the multi-level filtering lookup table for parallel or serial lookup, outputting at least one second intermediate value.
[0250] Optionally, a filter lookup table can be selected based on at least one of the following: quantization parameters, boundary strength, characterization interval, position information of the pixel to be filtered, pixel value to be filtered, size information of the image block, filtering information of neighboring blocks, filtering information of non-neighboring blocks, filtering information of cross-component blocks, filtering information of co-located blocks, filtering information of temporal blocks, filtering information of default blocks, and filtering information of candidate blocks. The pixel value to be filtered can be converted into an index and input into the selected filter lookup table for lookup to determine the filtered pixel, which is output as the second intermediate value.
[0251] Optionally, at least one second intermediate value can be input into the neural network to obtain filtered pixels as output, and the target image block can be determined or generated based on the filtered pixels.
[0252] Optionally, the neural network may be determined based on at least one reference parameter.
[0253] Optionally, an image block including at least one second intermediate value can be input into a neural network for filtering, and the output can be a target image block including the filtered pixels.
[0254] In this embodiment, at least one second intermediate value is obtained by searching at least one filter lookup table based on at least one reference parameter, and a target image block is determined or generated based on the neural network and at least one second intermediate value. The filtering effect can be improved by using the neural network and the filter lookup table together for filtering processing, thereby improving the effect of video encoding and / or decoding.
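Method 15 reverses the order: the lookup table produces the second intermediate values, and the network refines them. In the sketch below, `toy_post_network` is a fixed [1, 2, 1] / 4 smoothing with replicated edges, used purely as a stand-in for a trained neural network; the shift and the table contents are likewise assumptions.

```python
from typing import List


def lookup_second_intermediates(row: List[int], lut: List[int], shift: int = 4) -> List[int]:
    """Convert each pixel to an index and read its second intermediate value."""
    return [lut[p >> shift] for p in row]


def toy_post_network(values: List[int]) -> List[int]:
    """Stand-in for a trained network: [1, 2, 1] / 4 smoothing, edges replicated."""
    out = []
    last = len(values) - 1
    for i, v in enumerate(values):
        left = values[max(i - 1, 0)]
        right = values[min(i + 1, last)]
        out.append((left + 2 * v + right + 2) // 4)  # +2 rounds to nearest
    return out


def method15_filter_row(row: List[int], lut: List[int], shift: int = 4) -> List[int]:
    """Table lookup first, then the network stand-in on the looked-up values."""
    return toy_post_network(lookup_second_intermediates(row, lut, shift))
```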
[0255] Method 16: Determine or generate at least one index based on at least one reference parameter, and determine or generate the target image block based on at least one index and at least one filter lookup table;
[0256] Optionally, at least one index can be determined by at least one reference parameter in at least one image block, and the index can be determined by referring to at least one of methods 31 to 37 in the following embodiments.
[0257] Optionally, the index can be a number, an array, an identifier, a label, etc.
[0258] Optionally, it can be determined whether the index of the pixel value to be filtered of at least one image block needs to be updated based on at least one of the following: quantization parameters, boundary strength, characterization parameters, position information of the pixel to be filtered, size information of the image block, filtering information of neighboring blocks, filtering information of non-neighboring blocks, filtering information of cross-component blocks, filtering information of co-located blocks, filtering information of temporal blocks, filtering information of default blocks, and filtering information of candidate blocks. If necessary, the converted index can be increased or decreased to obtain the updated index.
[0259] Optionally, after determining at least one index, the at least one index can be input into at least one filter lookup table for lookup to determine the corresponding filter pixel and output it, and the target image block can be determined or generated based on the output at least one filter pixel.
[0260] Optionally, after determining or generating at least one index based on at least one reference parameter, such as a first index, the first index is input into the first layer of the multi-layer filter lookup table for searching to obtain a first search result. Then, a second index is determined or obtained based on the first search result, and the second index is input into the second layer of the multi-layer filter lookup table for searching, until the last layer of the filter lookup table outputs the filtered pixel or the target image block containing the filtered pixel. Optionally, the at least one filter lookup table may include a multi-layer filter lookup table.
[0261] In this embodiment, the target image block is determined or generated based on at least one index determined or generated according to at least one reference parameter, and at least one filter lookup table. By using a filter lookup table for filtering, the complexity of the filtering process can be reduced, thereby improving the efficiency of video encoding and / or decoding.
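The serial lookup through a multi-layer filter lookup table described for Method 16 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of this application: the names `clamp_index` and `cascaded_lookup`, the table contents, and the rule that each search result is used directly as the next index are all hypothetical.

```python
from typing import Sequence


def clamp_index(value: int, table_len: int) -> int:
    """Keep an index inside the valid range of a table (assumed policy)."""
    return max(0, min(table_len - 1, value))


def cascaded_lookup(first_index: int, layers: Sequence[Sequence[int]]) -> int:
    """Search each layer in turn; the result of one layer indexes the next."""
    index = first_index
    result = first_index
    for table in layers:
        result = table[clamp_index(index, len(table))]
        # Assumption: the search result is used directly as the next index.
        index = result
    return result


# Hypothetical two-layer table over 4-bit indices.
layer1 = [min(15, v + 1) for v in range(16)]  # first layer: small correction
layer2 = [v * 16 for v in range(16)]          # last layer: filtered pixel value
filtered_pixel = cascaded_lookup(3, [layer1, layer2])
```

The filtering cost here is one table access per layer, independent of any network size, which is the complexity argument made in the paragraph above.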
[0262] Method 17: Determine or generate at least one third intermediate value based on a neural network and at least one reference parameter; determine or generate at least one fourth intermediate value based on at least one third intermediate value and at least one filter lookup table; determine or generate a target image patch based on a neural network and at least one fourth intermediate value.
[0263] Optionally, the third and / or fourth intermediate values can be data generated during an intermediate processing step in the filtering process, or can be filtered pixels generated after at least one filtering process, or can be other values. Optionally, the fourth, and / or third, and / or second, and / or first intermediate values can be the same or different.
[0264] Optionally, the neural network can be determined or selected based on at least one reference parameter, and the at least one reference parameter can be input into the neural network for filtering processing to output a third intermediate value. Alternatively, at least one image patch and its reference parameters can be input into the neural network for filtering processing to output an image patch that has been filtered once. Pixels in that once-filtered image patch can be used as the third intermediate value.
[0265] Optionally, at least one third intermediate value can be input into at least one filter lookup table to determine or generate at least one fourth intermediate value. Optionally, at least one third intermediate value can be converted into an index (e.g., directly using the third intermediate value as the index, or transforming the third intermediate value to obtain the index), and then the index can be input into at least one filter lookup table to look up and output at least one fourth intermediate value. Optionally, the filter lookup table includes the correspondence between the third and fourth intermediate values, and indexes can be set in the filter lookup table, with each correspondence corresponding to one index.
[0266] Optionally, at least one filter lookup table can be a multi-level filter lookup table. At least one third intermediate value can be input into the multi-level filter lookup table for serial or parallel lookup, and at least one fourth intermediate value can be output.
[0267] Optionally, at least one fourth intermediate value can be input into the neural network for processing, and the output can be a target image patch including at least one filtered pixel.
[0268] In this embodiment, at least one reference parameter is filtered using an architecture of neural network, filter lookup table, and neural network, in that order, to determine or generate target image blocks. The filtering effect can be improved by using the neural network and the filter lookup table together, thereby improving the video encoding and / or decoding effect.
[0269] Method 18: Determine or generate at least one fifth intermediate value based on at least one reference parameter and at least one filter lookup table; determine or generate at least one sixth intermediate value based on at least one fifth intermediate value and a neural network; and determine or generate a target image patch based on at least one sixth intermediate value and at least one filter lookup table.
[0270] Optionally, the fifth intermediate value and / or the sixth intermediate value can be data generated during an intermediate processing step in the filtering process, which can be filtered pixels generated after at least one filtering process, or other values. Optionally, the sixth intermediate value, and / or the fifth intermediate value, and / or the fourth intermediate value, and / or the third intermediate value, and / or the second intermediate value, and / or the first intermediate value can be the same or different.
[0271] Optionally, at least one filter lookup table can be selected from multiple filter lookup tables using at least one reference parameter, and the at least one reference parameter can be input into the at least one filter lookup table to search and obtain at least one fifth intermediate value. Optionally, an index can also be determined based on at least one reference parameter, and the index can be input into the at least one filter lookup table to search and determine at least one fifth intermediate value.
[0272] Optionally, the filter lookup table includes a correspondence between at least one reference parameter and a fifth intermediate value, and an index can be set for each correspondence in the filter lookup table.
[0273] Optionally, at least one filter lookup table can be a multi-level filter lookup table. The first reference parameter can be converted into an index that is input to the multi-level filter lookup table for parallel or serial lookup, and the output is at least one fifth intermediate value.
[0274] Optionally, at least one fifth intermediate value can be input into the neural network for filtering, and the output can be at least one sixth intermediate value. Alternatively, an image patch including at least one fifth intermediate value can be input into the neural network for filtering, and the output can be an image patch including at least one sixth intermediate value.
[0275] Optionally, at least one sixth intermediate value can be input into at least one filter lookup table to determine the filtered pixel, and the target image block can be determined or generated based on the filtered pixel. Alternatively, at least one sixth intermediate value can be used to determine an index, which can then be input into at least one filter lookup table to determine the filtered pixel.
[0276] Optionally, the filter lookup table may also include a correspondence between at least one sixth intermediate value and the filtered pixel, and an index may be set for each correspondence in the filter lookup table.
[0277] Optionally, at least one sixth intermediate value can be input into a multi-layer filter lookup table for parallel or serial lookup, and the output can be either the filtered pixel or the target image block containing the filtered pixel.
[0278] In this embodiment, at least one reference parameter is filtered using an architecture of filter lookup table, neural network, and filter lookup table, in that order, to determine or generate target image blocks. The filtering effect can be improved by using the neural network and the filter lookup table together, thereby improving the effect of video encoding and / or decoding.
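Methods 17 and 18 differ only in the order in which the neural network and the filter lookup table are applied. The sketch below illustrates the two orderings under loudly stated assumptions: `toy_network` is a fixed 3-tap smoothing filter standing in for the neural network, `IDENTITY_LUT` is a placeholder table, and each value is used directly as its own index. None of these choices comes from this application.

```python
import numpy as np


def toy_network(values: np.ndarray) -> np.ndarray:
    """Stand-in for the neural network (assumption): 3-tap smoothing."""
    padded = np.pad(values.astype(np.float64), 1, mode="edge")
    smoothed = (padded[:-2] + 2.0 * padded[1:-1] + padded[2:]) / 4.0
    return np.clip(np.rint(smoothed), 0, 255).astype(np.int64)


def lut_stage(values: np.ndarray, table: np.ndarray) -> np.ndarray:
    """Use each value directly as the index into the filter lookup table."""
    return table[np.clip(values, 0, len(table) - 1)]


IDENTITY_LUT = np.arange(256, dtype=np.int64)  # hypothetical table content


def method_17(pixels: np.ndarray, table: np.ndarray) -> np.ndarray:
    third = toy_network(pixels)        # network -> third intermediate values
    fourth = lut_stage(third, table)   # table   -> fourth intermediate values
    return toy_network(fourth)         # network -> target image patch


def method_18(pixels: np.ndarray, table: np.ndarray) -> np.ndarray:
    fifth = lut_stage(pixels, table)   # table   -> fifth intermediate values
    sixth = toy_network(fifth)         # network -> sixth intermediate values
    return lut_stage(sixth, table)     # table   -> target image patch


row = np.array([10, 10, 200, 10, 10], dtype=np.int64)
out17 = method_17(row, IDENTITY_LUT)
out18 = method_18(row, IDENTITY_LUT)
```

With the identity table, `method_18` reduces to a single pass of the stand-in network, which makes the role of each stage easy to check in isolation.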
[0279] Fourth embodiment
[0280] Based on any of the above embodiments, a fourth embodiment is proposed.
[0281] In this embodiment, the neural network includes at least one of the following Methods 19 to 22.
[0282] Method 19: A neural network based on fully connected layers;
[0283] Method 20: A neural network based on convolutional layers;
[0284] Method 21: A neural network based on Transformer;
[0285] Method 22: A neural network based on a hybrid of convolutional layers, fully connected layers, and Transformer.
[0286] Optionally, for Method 14, at least one first intermediate value can be determined or generated based on at least one reference parameter and at least one of the following: a neural network based on fully connected layers, a neural network based on convolutional layers, a neural network based on Transformer, and a neural network based on a mixture of convolutional, fully connected, and Transformer layers. The target image patch is then determined or generated based on the at least one first intermediate value and at least one filter lookup table.
[0287] Optionally, for method 15, at least one second intermediate value can be obtained by searching in at least one filter lookup table based on at least one reference parameter, and the target image patch can be determined or generated based on at least one of the following: a neural network based on a fully connected layer, a neural network based on a convolutional layer, a neural network based on a Transformer, and a neural network based on a hybrid convolutional, fully connected, and Transformer layer, and at least one second intermediate value.
[0288] Optionally, for method seventeen, at least one of the following can be used: a neural network based on fully connected layers, a neural network based on convolutional layers, a neural network based on Transformer, and a neural network based on a mixture of convolutional, fully connected, and Transformer layers, along with at least one reference parameter, to determine or generate at least one third intermediate value. Based on the at least one third intermediate value and at least one filter lookup table, at least one fourth intermediate value is then determined or generated. The target image patch is determined or generated based on at least one of the following: a neural network based on fully connected layers, a neural network based on convolutional layers, a neural network based on Transformer, and a neural network based on a mixture of convolutional, fully connected, and Transformer layers, along with at least one fourth intermediate value.
[0289] Optionally, for method eighteen, at least one fifth intermediate value may be determined or generated based on at least one reference parameter and at least one filter lookup table. At least one sixth intermediate value may be determined or generated based on at least one of the following: a neural network based on fully connected layers, a neural network based on convolutional layers, a neural network based on Transformer, and a neural network based on a mixture of convolutional, fully connected, and Transformer layers, and at least one fifth intermediate value; and a target image patch may be determined or generated based on the at least one sixth intermediate value and at least one filter lookup table.
[0290] Optionally, at least one of the following can be selected based on at least one reference parameter: a neural network based on fully connected layers, a neural network based on convolutional layers, a neural network based on Transformers, and a neural network based on a mixture of convolutional, fully connected, and Transformer layers, for use in any of methods fourteen to eighteen. For example, if the neural network used by the default block in filtering is a neural network based on convolutional layers, then the selected neural network is determined to be a neural network based on convolutional layers.
[0291] In this embodiment, by including at least one of the following neural networks: a neural network based on convolutional layers, a neural network based on Transformer, and a neural network based on a hybrid convolutional and fully connected layer and Transformer, at least one image block can be filtered. Furthermore, filtering can be performed by combining at least one filter lookup table, thereby further improving the filtering effect and thus improving the video encoding and / or decoding effect.
[0292] Optionally, when the neural network includes a convolutional layer-based neural network, in method fourteen, determining or generating at least one first intermediate value based on the neural network and at least one reference parameter includes steps S11 and S12:
[0293] Step S11: Process at least one image block input into the neural network containing a convolution module to obtain at least one second image block;
[0294] Optionally, processing at least one image patch may include rotation, flipping, and scaling.
[0295] Optionally, the convolution module includes at least one convolutional layer.
[0296] Optionally, step S11 may include rotating at least one image patch input into the neural network containing a convolution module to obtain at least one second image patch. Optionally, the at least one image patch may be rotated multiple times to obtain multiple second image patches, and the rotation angle may be the same each time, such as 90 degrees.
[0297] Optionally, at least one image patch and reference parameters can be input into a neural network containing a convolutional module, and the at least one image patch can be rotated to obtain at least one second image patch.
[0298] Step S12: Process at least one second image block according to the convolution module and reference parameters to determine or generate at least one first intermediate value.
[0299] Optionally, each second image block can be convolved using a convolution module and reference parameters to determine or generate an image block including at least one first intermediate value.
[0300] Optionally, steps S11 and S12 can also be performed when the neural network includes a neural network based on hybrid convolutional and fully connected layers and Transformer.
[0301] In this embodiment, by using a convolutional layer-based neural network for filtering, at least one input image block can be processed to obtain at least one second image block. The at least one second image block is then processed based on the convolutional modules and reference parameters of the neural network to determine or generate a first intermediate value. This is then combined with at least one filter lookup table for filtering to determine or generate a target image block. Utilizing both the neural network and the filter lookup table for filtering improves the filtering effect, thereby enhancing the video encoding and / or decoding performance.
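Step S11 with repeated 90-degree rotations can be sketched as below. This is an illustrative assumption about how the rotation might be carried out, using NumPy's `np.rot90` (counterclockwise by default); the function name `rotate_patch` and the 3x3 example patch are hypothetical.

```python
import numpy as np


def rotate_patch(patch: np.ndarray) -> list:
    """Return the patch rotated by 0, 90, 180 and 270 degrees."""
    return [np.rot90(patch, k) for k in range(4)]


patch = np.arange(9).reshape(3, 3)       # hypothetical input image patch
second_patches = rotate_patch(patch)     # the "second image patches"

# Rotating back by the same number of quarter turns restores the input,
# so results computed on a rotated patch can be re-aligned afterwards.
restored = np.rot90(second_patches[1], -1)
```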
[0302] Optionally, the convolutional layer of the convolutional module includes at least one of the following methods 23 to 26.
[0303] Method 23: Asymmetric convolutional layers;
[0304] Method 24: Grouped convolutional layers;
[0305] Method 25: Partial convolution layers;
[0306] Method 26: Depthwise separable convolutional layers.
[0307] Optionally, for step S12, in the neural network, if the asymmetric convolutional layer receives at least one second image block, convolution processing can be performed on the at least one second image block in the convolutional module. Optionally, the asymmetric convolutional layer may include a convolutional layer with a 1x3 kernel, or a convolutional layer with a 1x2 kernel.
[0308] Optionally, for each second image block, the input channels can be divided into multiple groups, and the image block features input by each group of input channels can be convolved. Then, the output results of each group of output channels can be merged to obtain the convolution result of the grouped convolutional layer convolving the second image block.
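The split, convolve, and merge sequence of a grouped convolutional layer can be sketched as follows. To keep the example short it assumes 1x1 kernels, so each group reduces to a small matrix product over channels; the function name, channel counts, and random weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def grouped_pointwise_conv(x: np.ndarray, weights: list) -> np.ndarray:
    """x has shape (C_in, H, W); weights holds one (C_out/G, C_in/G) matrix per group."""
    groups = len(weights)
    chunks = np.split(x, groups, axis=0)  # divide the input channels into groups
    outputs = [np.einsum("oc,chw->ohw", w, chunk)  # convolve each group separately
               for w, chunk in zip(weights, chunks)]
    return np.concatenate(outputs, axis=0)  # merge the per-group outputs


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5, 5))                     # 4 input channels
weights = [rng.standard_normal((2, 2)) for _ in range(2)]  # 2 groups
y = grouped_pointwise_conv(x, weights)

# The grouped layer equals a dense layer whose weight matrix is block diagonal,
# which is where the parameter saving comes from (8 weights instead of 16 here).
dense = np.zeros((4, 4))
dense[:2, :2] = weights[0]
dense[2:, 2:] = weights[1]
y_dense = np.einsum("oc,chw->ohw", dense, x)
```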
[0309] Optionally, at least one second image block can be input into a partial convolution layer, and partial convolution processing can be performed on the at least one second image block according to the partial convolution layer.
[0310] Optionally, at least one second image block can be input into a depthwise separable convolutional layer, and depthwise separable convolution processing can be performed on the at least one second image block according to the depthwise separable convolutional layer.
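The complexity benefit of a depthwise separable convolutional layer can be made concrete with a parameter count. The sketch below uses the standard definition (one k x k depthwise filter per input channel, followed by a 1x1 pointwise convolution) and omits bias terms; the channel counts and kernel size are hypothetical and are not taken from this application.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in an ordinary k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise step plus a 1x1 pointwise step (bias omitted)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 64 output channels, 3x3 kernel.
standard = standard_conv_params(64, 64, 3)
separable = depthwise_separable_params(64, 64, 3)
reduction = standard / separable
```

For this hypothetical layer the separable form needs 4672 weights instead of 36864, roughly an eightfold reduction.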
[0311] In this embodiment, when processing at least one second image patch using the convolutional module and reference parameters of the neural network, processing can be performed using at least one of the following: asymmetric convolutional layers, grouped convolutional layers, partially convolutional layers, and depthwise separable convolutional layers, to improve the filtering effect of the neural network. Further filtering is performed using a filter lookup table to determine or generate the target image patch. Utilizing both the neural network and the filter lookup table for filtering improves the filtering effect, thereby enhancing the video encoding and / or decoding performance.
[0312] Optionally, the image processing method further includes at least one of methods 27 to 30.
[0313] Method 27: At least two convolutional layers in the convolutional module have the same convolutional template;
[0314] Method 28: At least two convolutional layers in the convolutional module have convolutional templates that are distinct from each other.
[0315] Method 29: The pixel positions where the convolution kernels act are different in at least two convolution templates;
[0316] Method 30: Determine the number and position of pixels in the convolution template based on the receptive field.
[0317] Optionally, when performing convolution processing on at least one second image block in the neural network based on the convolutional module and reference parameters, the second image block can be convolved based on each convolutional layer in the convolutional module, and each convolutional layer can use a corresponding convolutional template to convolve the second image block. The convolutional template can be predetermined. Furthermore, when selecting the convolutional template, it can be determined based on the receptive field of the convolutional kernel of the convolutional layer. For example, for a 1x3 convolutional kernel, the number of pixels affected by the convolutional kernel in the convolutional template corresponding to the 1x3 convolutional kernel can be determined to be 3. For example, for a 1x2 convolutional kernel, the number of pixels affected by the convolutional kernel in the convolutional template corresponding to the 1x2 convolutional kernel can be determined to be 2. Optionally, the pixel positions affected by the convolutional kernel in each convolutional template can be different or the same; therefore, there can be multiple convolutional templates corresponding to convolutional kernels with the same receptive field. For example, for the receptive field of a 3x3 convolutional kernel, there can be 84 different 1x3 convolutional templates.
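The figure of 84 templates can be checked by enumeration, assuming a template is an unordered choice of 3 distinct pixel positions out of the 9 positions of a 3x3 receptive field. The position numbering 0 to 8 used below is only a label for the enumeration; the count of 36 for a 1x2 kernel follows from the same assumption and is not stated in the text.

```python
from itertools import combinations

positions = range(9)  # the 9 pixel positions of a 3x3 receptive field

# Every unordered choice of 3 positions is one candidate 1x3 template.
templates_1x3 = list(combinations(positions, 3))

# The same counting for a 1x2 kernel (2 positions out of 9).
templates_1x2 = list(combinations(positions, 2))

count_1x3 = len(templates_1x3)
count_1x2 = len(templates_1x2)
```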
[0318] Optionally, the convolution module may include multiple convolutional layers. Therefore, after the convolution module receives at least one second image block, it can sequentially use different convolutional layers for convolution processing in the convolution order. The convolution template used for each convolutional layer can be the same or different.
[0319] Optionally, at least one second image block can be convolved using a convolutional layer in the convolutional module based on a convolutional template and reference parameters, and then normalized to produce an image block including at least one first intermediate value. Optionally, the first intermediate value can be a filtered pixel after being filtered by a neural network.
[0320] Optionally, as shown in Figure 9, after the neural network receives the normalized input image, it can perform rotation processing on the input image. Optionally, the input image includes at least one image patch, that is, at least one image patch is rotated to obtain at least one second image patch. The rotation angle can be 90 degrees, and the rotation can be clockwise or counterclockwise four times.
[0321] When a convolutional module in a neural network performs convolution processing on at least one second image patch, at least one convolutional layer in the convolutional module can perform convolution processing on at least one second image patch according to a convolutional template. For example, the learnable parameters of a Conv1x3 convolutional layer include the convolutional template. Therefore, at this convolutional layer, pixels can be extracted from at least one second image patch according to the convolutional template to form a 1x3 image patch. The result of convolving with the 1x3 convolutional kernel is then averaged and passed to at least one 1x1 convolutional residual block (i.e., a Conv1x1 block). For example, after three 1x1 convolutional residual blocks are connected, a normalized image patch (i.e., a normalized output) is output and fed into the next convolutional module. Optionally, nine convolutional modules can be stacked to increase the network depth of the neural network, expand the receptive field, and thus enrich the extracted features. Optionally, in the nine convolutional modules, the convolutional kernel of at least one convolutional layer can use different convolutional templates. For example, the convolutional kernels of the convolutional layers in each convolutional module can use different templates. Optionally, the convolution templates are as shown in Figure 10. Optionally, the convolution template may include pixels I0 to I8. The number and position of pixels in the convolution template can be determined based on the receptive field of the convolution kernel. Therefore, for a 1x3 convolution kernel, the number of pixels involved in the corresponding convolution template can be determined to be 3, and they can be in different positions. For a 1x2 convolution kernel, the number of pixels involved in the corresponding convolution template can be determined to be 2, and they can be in different positions.
[0322] Optionally, as shown in Figure 10, a convolutional module in a neural network can include three levels, and each level can include at least one convolutional layer. For example, the first-level convolutional module includes a convolutional layer with three 1x3 kernels. The second-level convolutional module includes a convolutional layer with three 1x3 kernels. The third-level convolutional module includes a convolutional layer with three 1x3 kernels.
[0323] Optionally, in the first-level convolutional module, the active (i.e., selected) pixels in the convolutional template of the first 1x3 convolutional layer can be pixels I0, I1, and I2. The active pixels in the convolutional template of the second 1x3 convolutional layer can be pixels I0, I4, and I5. The active pixels in the convolutional template of the third 1x3 convolutional layer can be pixels I0, I7, and I8.
[0324] Optionally, in the second-level convolutional module, the active (i.e., selected) pixels in the convolutional template of the first 1x3 convolutional layer can be pixels I0, I1, and I2. The active pixels in the convolutional template of the second 1x3 convolutional layer can be pixels I0, I4, and I7. The active pixels in the convolutional template of the third 1x3 convolutional layer can be pixels I0, I5, and I8.
[0325] Optionally, in the third-level convolutional module, the active (i.e., selected) pixels in the convolutional template of the first 1x3 convolutional layer can be pixels I0, I1, and I2. The active pixels in the convolutional template of the second 1x3 convolutional layer can be pixels I0, I4, and I8. The active pixels in the convolutional template of the third 1x3 convolutional layer can be pixels I0, I5, and I7.
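How a convolution template turns a 1x3 kernel into a set of sampled positions can be sketched as below. The first-level templates (I0, I1, I2), (I0, I4, I5), and (I0, I7, I8) are taken from the text, but the spatial layout of I0 to I8 is defined by Figure 10 and is not reproduced here, so the `OFFSETS` table is a purely hypothetical layout (I0 at the centre, I1 to I8 in row-major order around it). The function name, edge padding, and kernel weights are likewise assumptions.

```python
import numpy as np

# HYPOTHETICAL layout of I0..I8 as (row offset, column offset) from the
# pixel being filtered; the real layout is given by Figure 10.
OFFSETS = {0: (0, 0), 1: (-1, -1), 2: (-1, 0), 3: (-1, 1), 4: (0, -1),
           5: (0, 1), 6: (1, -1), 7: (1, 0), 8: (1, 1)}

LEVEL_1_TEMPLATES = [(0, 1, 2), (0, 4, 5), (0, 7, 8)]  # from the text


def template_conv(patch: np.ndarray, template: tuple, kernel: np.ndarray) -> np.ndarray:
    """Apply a 3-weight kernel to the three pixels selected by the template."""
    padded = np.pad(patch.astype(np.float64), 1, mode="edge")
    h, w = patch.shape
    out = np.zeros((h, w))
    for weight, idx in zip(kernel, template):
        dy, dx = OFFSETS[idx]
        # Shifted view of the padded patch: the pixel at offset (dy, dx).
        out += weight * padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return out


patch = np.full((4, 4), 7.0)              # hypothetical second image block
kernel = np.array([0.5, 0.25, 0.25])      # hypothetical learned weights
outputs = [template_conv(patch, t, kernel) for t in LEVEL_1_TEMPLATES]
```

Only three multiplications per output pixel are needed regardless of which template is chosen, while different templates let stacked layers cover different parts of the 3x3 neighbourhood.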
[0326] Optionally, as shown in Figure 10, rotating the input image patch in the neural network gives a convolutional layer with a 1x3 kernel a 3x3 receptive field. For the receptive field of a 3x3 kernel, there are 84 possible 1x3 convolutional templates. For example, as shown in Figure 11, when at least one rotated second image block is convolved using a convolution template corresponding to a convolution kernel with a small receptive field, the increase in receptive field is not significant. The asymmetric convolutional layer shown in Figure 12 performs convolution processing on at least one second image block and achieves a convolution effect close to that of the receptive field of a 3x3 convolutional kernel. For example, as shown in Figure 13, for a 1x2 convolution kernel, the convolution template shown in Figure 13 can achieve a convolution effect similar to that of the receptive field of a 3x3 convolution kernel. For example, as shown in Figure 14, a dilated convolution kernel can be used, meaning that a 1x3 convolution kernel in the convolutional layer covers a width and / or height exceeding 3, such as covering the receptive field of a 4x4 convolution kernel. This allows the 1x3 convolution kernel to perform convolution processing on at least one rotated second image block, achieving a convolution effect similar to convolution processing with the receptive field of a 4x4 convolution kernel.
[0327] Optionally, the neural network of Figure 9 can be optimized. For example, as shown in Figure 15, the normalized input image can be rotated in the neural network to obtain four second image blocks. The Conv1x3 convolutional layer uses learnable parameters (i.e., convolution templates) to perform convolution processing on each second image block. The convolution results for the rotation directions are then combined by a weighted average using four learnable weights, one per direction. The result is then fed into a Conv1x1 convolutional layer and normalized.
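The weighted averaging over four rotation directions can be sketched as follows. This is a minimal illustration under assumptions: `horizontal_1x3` is a fixed averaging filter standing in for the learned Conv1x3 layer, the four weights are plain numbers rather than trained parameters, they are normalised to sum to one, and each result is rotated back before averaging so that the four directions are aligned.

```python
import numpy as np


def horizontal_1x3(p: np.ndarray) -> np.ndarray:
    """Stand-in for the Conv1x3 layer (assumption): horizontal 3-tap mean."""
    padded = np.pad(p, ((0, 0), (1, 1)), mode="edge")
    return (padded[:, :-2] + padded[:, 1:-1] + padded[:, 2:]) / 3.0


def rotation_weighted_average(patch: np.ndarray, op, weights) -> np.ndarray:
    """Process four rotated copies and fuse them with per-direction weights."""
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()  # assumption: weights are normalised to sum to 1
    fused = np.zeros(patch.shape, dtype=np.float64)
    for k in range(4):
        processed = op(np.rot90(patch, k))       # filter the rotated copy
        fused += w[k] * np.rot90(processed, -k)  # rotate back, then accumulate
    return fused


patch = np.arange(16, dtype=np.float64).reshape(4, 4)
fused = rotation_weighted_average(patch, horizontal_1x3, [1.0, 1.0, 1.0, 1.0])
```

Because the horizontal filter is applied to rotated copies, the fused result mixes horizontal and vertical neighbours even though only a 1x3 kernel is ever evaluated.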
[0328] Optionally, when the convolution module in the neural network performs convolution processing on the rotated second image block, it can combine the reference parameters of at least one image block for convolution processing. For example, the reference parameters can be the position information of the pixel to be filtered.
[0329] Optionally, the relative position encoding can be determined based on the position information of the pixel to be filtered. For example, as shown in Figure 16, when performing convolution processing on the rotated second image block in the Conv1x3 convolutional layer, the convolution processing can be performed based on both the convolution template and the relative position encoding. For example, at least one second image block can be processed based on the convolution template in which the convolution kernel acts on pixels I0, I4, and I5, together with the corresponding relative position encoding, and the result is then passed through multiple Conv1x1 convolutional layers until a normalized image block is output.
[0330] In this embodiment, when processing at least one second image block using the convolutional module and reference parameters of the neural network, the convolutional module can be configured as follows: at least two convolutional layers in the convolutional module have the same convolutional template, and / or the convolutional templates corresponding to at least two convolutional layers include mutually different convolutional templates, and / or the pixel positions acted upon by the convolutional kernels in at least two templates are different, and / or the number and position of pixels in the convolutional template are determined according to the receptive field of the convolutional kernel, thereby improving the filtering effect of the neural network. Then, a filter lookup table is used for filtering to determine or generate the target image block. Using both the neural network and the filter lookup table for filtering improves the filtering effect, thereby improving the video encoding and / or decoding effect.
[0331] Fifth embodiment
[0332] Based on any of the above embodiments, a fifth embodiment is proposed.
[0333] In this embodiment, determining or generating at least one index based on at least one reference parameter in method sixteen includes at least one of the following methods thirty-one to thirty-seven.
[0334] Method 31: Determine or generate a first pixel parameter based on at least one reference parameter, and determine or generate at least one index based on the first pixel parameter;
[0335] Optionally, the pixel parameter may include values or data obtained by mathematical calculation, deformation, or combination of a pixel or a reference parameter associated with that pixel. The first pixel parameter may include values or data obtained by mathematical calculation, deformation, or combination of at least one reference parameter.
[0336] Optionally, at least one pixel to be filtered from at least one image block can be input into a mathematical formula, model, or function to calculate the first pixel parameter. The index can then be determined from the first pixel parameter, for example, by directly using the first pixel parameter as the index, or by deforming or otherwise processing the first pixel parameter to obtain the index, or by inputting the first pixel parameter into a table containing the mapping relationship between pixel parameters and indexes to obtain the index corresponding to the first pixel parameter.
[0337] Optionally, at least one index determined based on the first pixel parameter can be input into at least one filter lookup table to determine at least one filter pixel, and the target image block can be determined based on the at least one filter pixel.
[0338] Optionally, for method fourteen, at least one first intermediate value may be determined or generated based on a neural network and at least one reference parameter, a pixel parameter may be determined or generated based on the at least one first intermediate value, at least one index may be determined or generated based on the pixel parameter, and the corresponding filtered pixel may be determined by searching in at least one filter lookup table based on the at least one index, thereby determining or generating the target image block.
[0339] Optionally, for method 15, a first pixel parameter may be determined or generated based on at least one reference parameter, at least one index may be determined or generated based on the first pixel parameter, at least one second intermediate value may be obtained by searching in at least one filter lookup table based on the at least one index, and a target image patch may be determined or generated based on the neural network and the at least one second intermediate value.
[0340] Optionally, for method seventeen, at least one third intermediate value may be determined or generated based on a neural network and at least one reference parameter, a pixel parameter may be determined or generated based on the at least one third intermediate value, at least one index may be determined or generated based on the pixel parameter, at least one fourth intermediate value may be determined or generated based on the at least one index in at least one filter lookup table, and a target image patch may be determined or generated based on the neural network and at least one fourth intermediate value.
[0341] Optionally, for method eighteen, a pixel parameter may be determined or generated based on at least one reference parameter, at least one index may be determined or generated based on the pixel parameter, at least one fifth intermediate value may be determined or generated by looking up the at least one index in at least one filter lookup table, at least one sixth intermediate value may be determined or generated based on the at least one fifth intermediate value and the neural network, a new pixel parameter may be determined or generated based on the at least one sixth intermediate value (e.g., directly using the sixth intermediate value as the new pixel parameter, or deforming the sixth intermediate value, or otherwise processing it to obtain a new pixel parameter), at least one new index may be determined or generated based on the new pixel parameter, and the at least one new index may be looked up in at least one filter lookup table to determine or generate the target image patch.
[0342] In this embodiment, at least one index is determined or generated based on a first pixel parameter determined or generated according to at least one reference parameter, and the target image block is determined by searching in at least one filter lookup table according to the at least one index. By using a filter lookup table for filtering, the complexity of the filtering process can be reduced, thereby improving the efficiency of video encoding and / or decoding.
[0343] Method 32: Based on at least one reference parameter and at least one representation lookup table, determine or generate at least one first representation value, and based on at least one first representation value, determine or generate at least one index;
[0344] Optionally, the representation value can be a numerical value used to describe or represent the characteristics, attributes, etc. of an object, phenomenon, etc. It is often processed or selected to reflect key features, such as using the average score to represent the learning level of a class. Optionally, the representation value can be a sampled value. For example, for the 256 pixel values [0,1,2,…,255], sampling at an interval of 4 bits (i.e., every 2^4 = 16 values) yields the sampled values [0,16,32,48,64,80,96,112,128,144,160,176,192,208,224,240].
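The sampling example above can be reproduced with a short sketch. The function name and its parameters are illustrative assumptions, not part of the application; the sketch only assumes that "an interval of 4 bits" means keeping every 2^4-th value.

```python
def sample_representation_values(bit_depth=8, dropped_bits=4):
    """Return every 2**dropped_bits-th value in [0, 2**bit_depth - 1].

    Hypothetical helper: bit_depth and dropped_bits are illustrative names.
    """
    step = 1 << dropped_bits            # 4 dropped bits -> step of 16
    return list(range(0, 1 << bit_depth, step))


if __name__ == "__main__":
    # For 8-bit pixels this lists the 16 sampled values from 0 to 240.
    print(sample_representation_values())
```

With a coarser interval (for example, 6 bits) the same helper would return only four representation values, which shrinks any table indexed by them.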
[0345] Optionally, the representation lookup table can be a lookup table that includes representation values, or it can be a sampling lookup table. Optionally, the representation lookup table can include the correspondence between pixels, the correspondence between representation values and non-representation values, and the correspondence between representation values and pixel values. Optionally, an index can also be set in the representation lookup table so that the corresponding representation value can be determined by looking it up in the representation lookup table according to the index.
[0346] Optionally, at least one reference parameter (such as the pixel value to be filtered) can be input into at least one representation lookup table for lookup to determine the representation value corresponding to the at least one pixel value to be filtered, and then output to obtain the first representation value. Optionally, at least one reference parameter can be converted into an index that matches the representation lookup table and then input into at least one representation lookup table for lookup to determine the first representation value corresponding to the at least one pixel value to be filtered.
[0347] Optionally, the index can be determined based on the first representation value. For example, the first representation value can be directly used as the index, or the first representation value can be transformed or otherwise processed to obtain the index. Alternatively, the first representation value can be input into a table containing the mapping relationship between representation values and indexes to obtain the index corresponding to the first representation value.
[0348] Optionally, at least one index determined based on the first representation value can be input into at least one filter lookup table to determine at least one filtered pixel, and the target image block can be determined based on the at least one filtered pixel.
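As a minimal sketch of the flow described for Method 32 (reference parameter, then representation lookup table, then index, then filter lookup table), the following uses placeholder tables. The table contents, the step of 16, and all names are assumptions made for illustration; a real filter lookup table would hold values derived from a trained filter.

```python
STEP = 16  # assumed interval between representation values

# Hypothetical representation lookup table: pixel value -> representation value.
REPRESENTATION_LUT = [(v // STEP) * STEP for v in range(256)]

# Hypothetical filter lookup table: index -> filtered pixel.
# Placeholder content (bin midpoints), NOT a trained filter.
FILTER_LUT = [i * STEP + STEP // 2 for i in range(256 // STEP)]


def index_from_representation(rep):
    """One possible mapping from a representation value to a table index."""
    return rep // STEP


def filter_block(pixels):
    """Filter a flat list of 8-bit pixel values using the two tables."""
    filtered = []
    for p in pixels:
        rep = REPRESENTATION_LUT[p]            # first representation value
        idx = index_from_representation(rep)   # index into the filter table
        filtered.append(FILTER_LUT[idx])       # filtered pixel
    return filtered


if __name__ == "__main__":
    print(filter_block([0, 36, 255]))
```

Each pixel costs two table reads and one integer division, which is the kind of saving over a per-pixel network evaluation that the description relies on.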
[0349] Optionally, for method fourteen, at least one first intermediate value may be determined or generated based on a neural network and at least one reference parameter, at least one representation value may be determined or generated based on the at least one first intermediate value and at least one representation lookup table, at least one index may be determined or generated based on the at least one representation value, and the corresponding filtered pixel may be determined by searching in at least one filter lookup table based on the at least one index, thereby determining or generating the target image block.
[0350] Optionally, for method 15, at least one representation value may be determined or generated based on at least one reference parameter and at least one representation lookup table, at least one index may be determined or generated based on the at least one representation value, at least one second intermediate value may be obtained by searching in at least one filter lookup table based on the at least one index, and a target image patch may be determined or generated based on the neural network and at least one second intermediate value.
[0351] Optionally, for method seventeen, at least one third intermediate value may be determined or generated based on a neural network and at least one reference parameter, at least one representation value may be determined or generated based on the at least one third intermediate value and at least one representation lookup table, at least one index may be determined or generated based on the at least one representation value, at least one fourth intermediate value may be determined or generated based on the at least one index in at least one filter lookup table, and a target image patch may be determined or generated based on the neural network and at least one fourth intermediate value.
[0352] Optionally, for method eighteen, at least one representation value may be determined or generated based on at least one reference parameter and at least one representation lookup table; at least one index may be determined or generated based on the at least one representation value; at least one fifth intermediate value may be determined or generated by searching in at least one filter lookup table based on the at least one index; and at least one sixth intermediate value may be determined or generated based on the at least one fifth intermediate value and the neural network. At least one new representation value may then be determined or generated based on the at least one sixth intermediate value and the at least one representation lookup table, at least one index may be determined or obtained based on the at least one new representation value, and the target image patch may be determined or generated by searching in at least one filter lookup table based on the at least one index.
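The two-pass flow described for method eighteen (table lookup, neural network, table lookup again) can be sketched as follows. The neural network is replaced by a fixed 3-tap smoothing function purely as a stand-in, and the table contents and names are assumptions; nothing here reflects the actual trained network or tables.

```python
STEP = 16  # assumed interval between representation values

# Placeholder filter lookup tables (bin midpoints), one per pass.
FILTER_LUT_1 = [i * STEP + STEP // 2 for i in range(256 // STEP)]
FILTER_LUT_2 = [i * STEP + STEP // 2 for i in range(256 // STEP)]


def to_index(value):
    """Quantise a value to its representation value, then map it to an index."""
    rep = (value // STEP) * STEP
    return rep // STEP


def stand_in_network(values):
    """Stand-in for the neural network: a fixed 1-2-1 smoothing filter."""
    n = len(values)
    out = []
    for i in range(n):
        left = values[max(i - 1, 0)]       # repeat the border sample
        right = values[min(i + 1, n - 1)]
        out.append((left + 2 * values[i] + right) // 4)
    return out


def method_eighteen(pixels):
    fifth = [FILTER_LUT_1[to_index(p)] for p in pixels]   # first lookup
    sixth = stand_in_network(fifth)                       # network stage
    return [FILTER_LUT_2[to_index(v)] for v in sixth]     # second lookup


if __name__ == "__main__":
    print(method_eighteen([0, 36, 255]))
```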
[0353] In this embodiment, at least one first representation value is determined or generated based on at least one reference parameter and at least one representation lookup table. At least one index is then determined or generated based on the first representation value, and the target image block is determined by searching in at least one filter lookup table according to the index. By employing a filter lookup table for filtering, the complexity of the filtering process can be reduced, thereby improving the efficiency of video encoding and / or decoding.
[0354] Method 33: Determine or generate a second pixel parameter based on at least one reference parameter, determine or generate at least one second representation value based on the second pixel parameter and at least one representation lookup table, and determine or generate at least one index based on at least one second representation value;
[0355] Optionally, the second pixel parameter can be a value or data obtained by mathematical calculation, transformation, or combination of at least one reference parameter (such as the pixel value to be filtered). Optionally, the second pixel parameter can be the same as or different from the first pixel parameter.
[0356] Optionally, after determining or generating the second pixel parameter based on at least one reference parameter, the second pixel parameter can be input into at least one representation lookup table for lookup to determine the representation value corresponding to the second pixel parameter and output it to obtain at least one second representation value. Optionally, the second pixel parameter can be converted into an index that matches the representation lookup table and then input into at least one representation lookup table for lookup to determine the second representation value corresponding to the second pixel parameter.
[0357] Optionally, the index can be determined based on the second representation value. For example, the second representation value can be directly used as the index, or the index can be obtained by transforming or otherwise processing the second representation value. Alternatively, the second representation value can be input into a table containing the mapping relationship between representation values and indexes to obtain the index corresponding to the second representation value.
[0358] Optionally, at least one index determined based on the second representation value can be input into at least one filter lookup table to determine at least one filtered pixel, and the target image block can be determined based on the at least one filtered pixel.
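A minimal sketch of the Method 33 ordering follows: a pixel parameter is computed first, then mapped to a representation value, then to an index. The choice of "integer mean of a pixel and its right neighbour" as the second pixel parameter is an assumption picked only to make the example concrete, as are the table contents and names.

```python
STEP = 16  # assumed interval between representation values

# Hypothetical representation lookup table: parameter value -> representation value.
REPRESENTATION_LUT = [(v // STEP) * STEP for v in range(256)]


def second_pixel_parameter(pixels, i):
    """Assumed combination: integer mean of pixel i and its right neighbour."""
    j = min(i + 1, len(pixels) - 1)    # repeat the last pixel at the border
    return (pixels[i] + pixels[j]) // 2


def index_for(pixels, i):
    """Pixel parameter -> second representation value -> index."""
    param = second_pixel_parameter(pixels, i)
    rep = REPRESENTATION_LUT[param]
    return rep // STEP


if __name__ == "__main__":
    row = [30, 50, 200]
    print([index_for(row, i) for i in range(len(row))])
```

The resulting indices would then be used to read the filter lookup table in the same way as in the other methods.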
[0359] Optionally, for method fourteen, at least one first intermediate value may be determined or generated based on a neural network and at least one reference parameter, pixel parameters may be determined or generated based on the at least one first intermediate value, at least one representation value may be determined or generated based on the pixel parameters and at least one representation lookup table, at least one index may be determined or generated based on the at least one representation value, and the corresponding filtered pixel may be determined by searching in at least one filter lookup table based on the at least one index, thereby determining or generating the target image block.
[0360] Optionally, for method 15, pixel parameters may be determined or generated based on at least one reference parameter, at least one representation value may be determined or generated based on the pixel parameters and at least one representation lookup table, at least one index may be determined or generated based on the at least one representation value, at least one second intermediate value may be obtained by searching in at least one filter lookup table based on the at least one index, and a target image patch may be determined or generated based on the neural network and the at least one second intermediate value.
[0361] Optionally, for method seventeen, at least one third intermediate value may be determined or generated based on a neural network and at least one reference parameter, pixel parameters may be determined or generated based on the at least one third intermediate value, at least one representation value may be determined or generated based on the pixel parameters and at least one representation lookup table, at least one index may be determined or generated based on the at least one representation value, at least one fourth intermediate value may be determined or generated by searching in at least one filter lookup table based on the at least one index, and a target image patch may be determined or generated based on the neural network and the at least one fourth intermediate value.
[0362] Optionally, for method eighteen, pixel parameters may be determined or generated based on at least one reference parameter; at least one representation value may be determined or generated based on the pixel parameters and at least one representation lookup table; at least one index may be determined or generated based on the at least one representation value; at least one fifth intermediate value may be determined or generated by searching in at least one filter lookup table based on the at least one index; and at least one sixth intermediate value may be determined or generated based on the at least one fifth intermediate value and a neural network. New pixel parameters may then be determined or generated based on the at least one sixth intermediate value, at least one new representation value may be determined or generated based on the new pixel parameters and the at least one representation lookup table, at least one index may be determined or obtained based on the at least one new representation value, and the target image patch may be determined or generated by searching in at least one filter lookup table based on the at least one index.
[0363] In this embodiment, a second pixel parameter is determined or generated based on at least one reference parameter, at least one second representation value is determined or generated based on the second pixel parameter and at least one representation lookup table, at least one index is determined or generated based on the at least one second representation value, and the target image block is determined by searching in at least one filter lookup table according to the at least one index. By using a filter lookup table for filtering, the complexity of the filtering process can be reduced, thereby improving the efficiency of video encoding and / or decoding.
[0364] Method 34: Determine or generate at least one third representation value based on at least one reference parameter and at least one representation lookup table; determine or generate a third pixel parameter based on the at least one third representation value; determine or generate at least one index based on the third pixel parameter.
[0365] Optionally, the third pixel parameter may be the same as or different from the first pixel parameter and / or the second pixel parameter.
[0366] Optionally, reference parameters of at least one image patch can be input into at least one representation lookup table for parallel or serial lookup to determine and output the representation value corresponding to the at least one reference parameter, thereby obtaining at least one third representation value. Optionally, at least one reference parameter can be converted into an index that matches the at least one representation lookup table and then input into the at least one representation lookup table for lookup to determine the at least one third representation value.
[0367] Optionally, the third representation value can be processed to obtain the third pixel parameter, for example by performing mathematical calculation, transformation, or combination on the third representation value and using the resulting value or data as the third pixel parameter.
[0368] Optionally, the index can be determined based on the third pixel parameter. For example, the third pixel parameter can be directly used as the index, or the index can be obtained by transforming or otherwise processing the third pixel parameter. Alternatively, the third pixel parameter can be input into a table containing the mapping relationship between pixel parameters and indexes to obtain the index corresponding to the third pixel parameter.
[0369] Optionally, at least one index determined based on the third pixel parameter can be input into at least one filter lookup table to determine at least one filtered pixel, and the target image block can be determined based on the at least one filtered pixel.
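The Method 34 ordering (representation values first, then a pixel parameter formed from them, then an index) can be sketched as below. Here the third pixel parameter is assumed to be a pair of quantisation levels, one for the current pixel and one for a neighbour, combined into a single index; this combination, the table contents, and all names are illustrative assumptions.

```python
STEP = 16              # assumed interval between representation values
LEVELS = 256 // STEP   # 16 representation values for 8-bit pixels

# Hypothetical representation lookup table: pixel value -> representation value.
REPRESENTATION_LUT = [(v // STEP) * STEP for v in range(256)]

# Hypothetical filter lookup table over pairs of levels.
# Placeholder content (mean of the two bin midpoints), NOT a trained filter.
FILTER_LUT = [
    ((idx // LEVELS) + (idx % LEVELS)) * (STEP // 2) + STEP // 2
    for idx in range(LEVELS * LEVELS)
]


def third_pixel_parameter(rep_a, rep_b):
    """Assumed combination: the pair of levels of two representation values."""
    return (rep_a // STEP, rep_b // STEP)


def index_from_parameter(param):
    """Flatten the pair of levels into one table index."""
    a, b = param
    return a * LEVELS + b


def filter_pixel(current, neighbour):
    rep_a = REPRESENTATION_LUT[current]
    rep_b = REPRESENTATION_LUT[neighbour]
    idx = index_from_parameter(third_pixel_parameter(rep_a, rep_b))
    return idx, FILTER_LUT[idx]


if __name__ == "__main__":
    print(filter_pixel(36, 200))
```

Quantising before combining keeps the table at 16 x 16 = 256 entries instead of 256 x 256, which illustrates why representation values are taken ahead of the pixel parameter in this ordering.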
[0370] Optionally, for method fourteen, at least one first intermediate value may be determined or generated based on a neural network and at least one reference parameter, at least one representation value may be determined or generated based on the at least one first intermediate value and at least one representation lookup table, pixel parameters may be determined or generated based on the at least one representation value, at least one index may be determined or generated based on the pixel parameters, and the corresponding filtered pixel may be determined by searching in at least one filter lookup table based on the at least one index, thereby determining or generating the target image block.
[0371] Optionally, for method 15, at least one representation value may be determined or generated based on at least one reference parameter and at least one representation lookup table, a pixel parameter may be determined or generated based on the at least one representation value, at least one index may be determined or generated based on the pixel parameter, at least one second intermediate value may be obtained by searching in at least one filter lookup table based on the at least one index, and a target image patch may be determined or generated based on a neural network and at least one second intermediate value.
[0372] Optionally, for method seventeen, at least one third intermediate value may be determined or generated based on a neural network and at least one reference parameter, at least one representation value may be determined or generated based on the at least one third intermediate value and at least one representation lookup table, pixel parameters may be determined or generated based on the at least one representation value, at least one index may be determined or generated based on the pixel parameters, at least one fourth intermediate value may be determined or generated based on the at least one index in at least one filter lookup table, and a target image patch may be determined or generated based on the neural network and at least one fourth intermediate value.
[0373] Optionally, for method eighteen, at least one representation value may be determined or generated based on at least one reference parameter and at least one representation lookup table, a pixel parameter may be determined or generated based on the at least one representation value, at least one index may be determined or generated based on the pixel parameter, at least one fifth intermediate value may be determined or generated by searching in at least one filter lookup table based on the at least one index, and at least one sixth intermediate value may be determined or generated based on the at least one fifth intermediate value and a neural network.
[0374] At least one representation value is determined or generated based on at least one sixth intermediate value and at least one representation lookup table, a pixel parameter is determined or generated based on the at least one representation value, at least one index is determined or obtained based on the pixel parameter, and a target image patch is determined or generated based on the at least one index in at least one filter lookup table.
[0375] In this embodiment, at least one third representation value is determined or generated based on at least one reference parameter and at least one representation lookup table. A third pixel parameter is determined or generated based on the at least one third representation value. At least one index is determined or generated based on the third pixel parameter. The target image block is then determined by searching in at least one filter lookup table according to the at least one index. By using a filter lookup table for filtering, the complexity of the filtering process can be reduced, thereby improving the efficiency of video encoding and / or decoding.
[0376] Method 35: Determine or generate at least one fourth representation value based on the most significant bits of at least one pixel value to be filtered, and determine or generate at least one index based on the at least one fourth representation value;
[0377] Optionally, the most significant bits can be the leading bits of the pixel value when it is represented in binary. The number of leading bits used (for example, only the highest bit, or the highest four bits) can be set according to user needs. For example, for an 8-bit pixel, the highest four bits of the pixel value can be used as the most significant bits. If pixel 1 = 36 (00100100), then the most significant bits of pixel 1 are 0010, i.e., 2.
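The worked example above (pixel value 36, highest four bits 0010) corresponds to a single right shift. The function name and parameters below are illustrative assumptions.

```python
def most_significant_bits(value, bit_depth=8, kept_bits=4):
    """Return the top kept_bits bits of a bit_depth-bit pixel value."""
    return value >> (bit_depth - kept_bits)


if __name__ == "__main__":
    pixel_1 = 36
    print(format(pixel_1, "08b"))            # binary form of the pixel value
    print(most_significant_bits(pixel_1))    # highest four bits as an integer
```

Used directly as an index, the four most significant bits of an 8-bit pixel address a 16-entry table, compared with 256 entries for the full value.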
[0378] Optionally, the most significant bits of at least one pixel value to be filtered in at least one image block can be determined, and at least one representation value can be determined based on the most significant bits. For example, the most significant bits of each of the at least one pixel values to be filtered can be processed to obtain at least one fourth representation value. For example, the most significant bits of the pixel value to be filtered can be directly used as the fourth representation value. Alternatively, the most significant bits can be transformed, weighted, or otherwise processed to obtain the fourth representation value.
[0379] Optionally, the index can be determined based on at least one fourth representation value. For example, the fourth representation value can be used directly as the index, or the fourth representation value can be transformed or otherwise processed to obtain the index. Alternatively, the fourth representation value can be input into a table containing the mapping relationship between representation values and indexes to obtain the index corresponding to the fourth representation value.
[0380] Optionally, at least one index determined based on the fourth representation value can be input into at least one filter lookup table to determine at least one filtered pixel, and the target image block can be determined based on the at least one filtered pixel.
[0381] Optionally, for method fourteen, at least one first intermediate value may be determined or generated based on a neural network and at least one reference parameter, at least one representation value may be determined or generated based on the most significant bits of the at least one first intermediate value, at least one index may be determined based on the at least one representation value, and the corresponding filtered pixel may be determined by searching in at least one filter lookup table based on the at least one index, thereby determining or generating the target image block.
[0382] Optionally, for method 15, at least one representation value may be determined or generated based on the most significant bits of the pixel value to be filtered in at least one reference parameter, at least one index may be determined or generated based on the at least one representation value, at least one second intermediate value may be obtained by searching in at least one filter lookup table based on the at least one index, and a target image patch may be determined or generated based on the neural network and the at least one second intermediate value.
[0383] Optionally, for method seventeen, at least one third intermediate value may be determined or generated based on a neural network and at least one reference parameter, at least one representation value may be determined or generated based on the most significant bits of the at least one third intermediate value, at least one index may be determined or generated based on the at least one representation value, at least one fourth intermediate value may be determined or generated by searching in at least one filter lookup table based on the at least one index, and a target image patch may be determined or generated based on the neural network and the at least one fourth intermediate value.
[0384] Optionally, for method eighteen, at least one representation value may be determined or generated based on the most significant bits of the pixel value to be filtered in at least one reference parameter, at least one index may be determined or generated based on the at least one representation value, at least one fifth intermediate value may be determined or generated by searching in at least one filter lookup table based on the at least one index, and at least one sixth intermediate value may be determined or generated based on the at least one fifth intermediate value and the neural network.
[0385] At least one representation value is determined or generated based on the most significant bits of the at least one sixth intermediate value, at least one index is determined or obtained based on the at least one representation value, and a target image patch is determined or generated by searching in at least one filter lookup table based on the at least one index.
[0386] In this embodiment, at least one fourth representation value is determined or generated based on the most significant bits of at least one pixel value to be filtered, at least one index is determined or generated based on the at least one fourth representation value, and the target image block is determined by searching in at least one filter lookup table according to the at least one index. By using a filter lookup table for filtering, the complexity of the filtering process can be reduced, thereby improving the efficiency of video encoding and / or decoding.
[0387] Method 36: Determine or generate at least one fifth representation value based on the representation parameter, and determine or generate at least one index based on the at least one fifth representation value;
[0388] Optionally, the representation parameter can be any parameter corresponding to the representation value, such as the interval between representation values (i.e., the representation interval). Optionally, when the representation value is a sampled value, the representation parameter can be a sampling interval.
[0389] Optionally, at least one fifth representation value can be determined based on at least one reference parameter (such as the pixel value to be filtered) and a representation parameter. For example, the pixel value of a pixel that is one representation interval away from the pixel to be filtered can be used as the fifth representation value. Alternatively, the fifth representation value can be determined based on a multiple of the representation interval and the pixel to be filtered. For example, the pixel value of a pixel that is a preset multiple of the representation interval away from the pixel to be filtered can be used as the fifth representation value.
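One way to read the paragraph above is that the interval acts as a positional offset from the pixel to be filtered. The sketch below follows that reading only; the interpretation, the clamping at the block border, and all names are assumptions rather than something the description fixes.

```python
def value_at_interval(pixels, i, interval, multiple=1):
    """Return the value of the pixel `multiple * interval` positions after pixel i.

    Assumptions: `pixels` is a flat row of pixel values, the interval is a
    positional offset, and positions past the border are clamped to the
    nearest valid pixel.
    """
    j = i + multiple * interval
    j = max(0, min(j, len(pixels) - 1))   # clamp to the valid range
    return pixels[j]


if __name__ == "__main__":
    row = [10, 20, 30, 40, 50, 60]
    print(value_at_interval(row, 1, interval=2))              # one interval away
    print(value_at_interval(row, 1, interval=2, multiple=2))  # two intervals away
```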
[0390] Optionally, the index can be determined based on the fifth representation value. For example, the fifth representation value can be directly used as the index, or the index can be obtained by transforming or otherwise processing the fifth representation value. Alternatively, the fifth representation value can be input into a table containing the mapping relationship between representation values and indexes to obtain the index corresponding to the fifth representation value.
[0391] Optionally, at least one index determined based on the fifth representation value can be input into at least one filter lookup table to determine at least one filtered pixel, and the target image block can be determined based on the at least one filtered pixel.
[0392] Optionally, for method fourteen, at least one first intermediate value may be determined or generated based on a neural network and at least one reference parameter, a representation value corresponding to the at least one first intermediate value may be determined based on a representation parameter, at least one index may be generated based on the at least one representation value, and a corresponding filtered pixel may be determined by searching in at least one filter lookup table based on the at least one index, thereby determining or generating a target image block.
[0393] Optionally, for method 15, the following steps may be taken: determining the representation value corresponding to at least one reference parameter based on the representation parameter; determining or generating at least one index based on the at least one representation value; searching in at least one filter lookup table based on the at least one index to obtain at least one second intermediate value; and determining or generating the target image patch based on the neural network and the at least one second intermediate value.
[0394] Optionally, for method seventeen, at least one third intermediate value may be determined or generated based on a neural network and at least one reference parameter, a representation value corresponding to the at least one third intermediate value may be determined or generated based on a representation parameter, at least one index may be determined or generated based on the at least one representation value, at least one fourth intermediate value may be determined or generated based on the at least one index in at least one filter lookup table, and a target image patch may be determined or generated based on the neural network and at least one fourth intermediate value.
[0395] Optionally, for method eighteen, the representation value corresponding to at least one reference parameter may be determined based on the representation parameter, at least one index may be determined or generated based on the at least one representation value, at least one fifth intermediate value may be determined or generated by searching in at least one filter lookup table based on the at least one index, and at least one sixth intermediate value may be determined or generated based on the at least one fifth intermediate value and the neural network.
[0396] The representation value corresponding to the at least one sixth intermediate value is then determined based on the representation parameter, at least one index is determined or obtained based on the at least one representation value, and the target image patch is determined or generated by searching in at least one filter lookup table based on the at least one index.
[0397] In this embodiment, at least one fifth representation value is determined or generated based on the representation parameter, at least one index is determined or generated based on the at least one fifth representation value, and the target image block is determined by searching in at least one filter lookup table according to the at least one index. By using a filter lookup table for filtering, the complexity of the filtering process can be reduced, thereby improving the efficiency of video encoding and / or decoding.
[0398] Method 37: Determine or generate at least one non-representational value based on at least one reference parameter, and determine or generate at least one index based on at least one non-representational value.
[0399] Optionally, non-representational values can be data or pixel values other than representational values. For example, when the representational value is a sampled value, the non-representational value can be a non-sampled value. That is, if each pixel (e.g., the pixel to be filtered) in at least one image block is sampled according to a certain rule (e.g., sampling interval) to obtain each sampled value, then the non-sampled value is the pixel value that was not sampled.
[0400] Optionally, at least one non-representational value can be determined or obtained based on at least one pixel value to be filtered in at least one image block using certain rules. Optionally, the at least one non-representational value determined based on the at least one pixel value to be filtered can be further processed based on at least one reference parameter, such as through format conversion, to obtain at least one new non-representational value.
[0401] Optionally, at least one non-representational value is converted into at least one index (e.g., the non-representational value is directly used as the index, or the non-representational value is transformed or otherwise processed to obtain the index), the at least one index is input into at least one filter lookup table for lookup to determine at least one filtered pixel, and the target image block is determined or generated based on the at least one filtered pixel.
[0402] Alternatively, an interpolation method can be used to determine the representation value nearest to the pixel corresponding to the at least one non-representational value; that representation value can be converted into an index and input into at least one filter lookup table corresponding to the representation value for lookup to determine at least one filtered pixel, and the target image block can be determined or generated based on the at least one filtered pixel.
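As a minimal sketch of mapping a non-representational value to its nearest representation value, the following assumes representation values sampled every 16 levels from 0 to 240 and uses nearest-neighbour rounding as the interpolation rule. The rounding rule for ties, the clamp at the top of the range, and the names are assumptions.

```python
STEP = 16       # assumed interval between representation values
MAX_REP = 240   # largest representation value for 8-bit pixels at this step


def is_representation(value):
    """True if the value is one of the sampled representation values."""
    return value % STEP == 0 and value <= MAX_REP


def nearest_representation(value):
    """Round to the nearest representation value (ties round up), clamped."""
    rep = ((value + STEP // 2) // STEP) * STEP
    return min(rep, MAX_REP)


def index_for(value):
    """Index of the filter-table entry serving this (possibly non-sampled) value."""
    return nearest_representation(value) // STEP


if __name__ == "__main__":
    for v in (36, 41, 255):
        print(v, is_representation(v), nearest_representation(v), index_for(v))
```

This keeps the filter lookup table limited to the sampled values while still giving every pixel value a valid index.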
[0403] Optionally, for method fourteen, at least one first intermediate value may be determined or generated based on a neural network and at least one reference parameter, at least one non-representational value may be determined based on the at least one first intermediate value, at least one index may be generated based on the at least one non-representational value, and the corresponding filtered pixel may be determined by searching in at least one filter lookup table based on the at least one index, thereby determining or generating the target image block.
[0404] Optionally, for method fifteen, at least one non-representational value may be determined or generated based on at least one reference parameter, at least one index may be determined or generated based on the at least one non-representational value, at least one second intermediate value may be obtained by searching in at least one filter lookup table based on the at least one index, and a target image block may be determined or generated based on a neural network and the at least one second intermediate value.
[0405] Optionally, for method seventeen, at least one third intermediate value may be determined or generated based on a neural network and at least one reference parameter, at least one non-representational value may be determined or generated based on the at least one third intermediate value, at least one index may be determined or generated based on the at least one non-representational value, at least one fourth intermediate value may be obtained by searching in at least one filter lookup table based on the at least one index, and a target image block may be determined or generated based on the neural network and the at least one fourth intermediate value.
[0406] Optionally, for method eighteen, at least one non-representational value may be determined or generated based on at least one reference parameter, at least one index may be determined or generated based on the at least one non-representational value, at least one fifth intermediate value may be obtained by searching in at least one filter lookup table based on the at least one index, and at least one sixth intermediate value may be determined or generated based on the at least one fifth intermediate value and a neural network.
[0407] Then, at least one non-representational value is determined or obtained based on the at least one sixth intermediate value, at least one index is determined or generated based on the at least one non-representational value, and a target image block is determined or generated by searching in at least one filter lookup table according to the at least one index.
[0408] In this embodiment, at least one non-representational value is determined or generated based on at least one reference parameter, at least one index is determined or generated based on at least one non-representational value, and the target image block is determined by searching in at least one filter lookup table according to the at least one index. By using a filter lookup table for filtering, the complexity of the filtering process can be reduced, thereby improving the efficiency of video encoding and / or decoding.
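The flow summarised in this embodiment can be illustrated with a minimal sketch in which the pixel value to be filtered is itself the non-representational value and is used directly as the index. The function `toy_filter` is a hypothetical stand-in for whatever filter the table was derived from, and `build_filter_lut` and `filter_block` are illustrative names, not names from this application.

```python
def toy_filter(value):
    """Hypothetical stand-in filter: pull the value toward mid-grey by 1/8 of the distance."""
    return value + (128 - value) // 8


def build_filter_lut(filter_fn, bit_depth=8):
    """Evaluate the filter once for every possible pixel value (done ahead of time)."""
    return [filter_fn(v) for v in range(1 << bit_depth)]


def filter_block(block, filter_lut):
    """Filter an image block with one table read per pixel; the pixel value is the index."""
    return [[filter_lut[pixel] for pixel in row] for row in block]


FILTER_LUT = build_filter_lut(toy_filter)
block = [[0, 64], [128, 255]]
filtered = filter_block(block, FILTER_LUT)
```

The cost of the filter is paid once when the table is built; at filtering time each pixel costs a single table read, which is the kind of complexity reduction the paragraph above describes.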
[0409] Sixth Embodiment
[0410] This application also provides a processing device. Referring to Figure 17, the processing device includes:
[0411] Processing module A10 is used to determine or generate a target image block based on reference parameters of at least one image block and a filter lookup table.
[0412] Optionally, the reference parameters include at least one of the following: quantization parameters; boundary strength; characterization interval; position information of the pixel to be filtered; pixel value to be filtered; size information of the image block; filtering information of neighboring blocks; filtering information of non-neighboring blocks; filtering information across component blocks; filtering information of co-located blocks; filtering information of temporal blocks; filtering information of default blocks; and filtering information of candidate blocks.
[0413] Optionally, candidate blocks are determined by motion vectors and / or block vectors.
[0414] Optionally, processing module A10 is configured to perform at least one of the following:
[0415] Determine or generate at least one first intermediate value based on a neural network and at least one reference parameter, and determine or generate a target image block based on the at least one first intermediate value and at least one filter lookup table;
[0416] Obtain at least one second intermediate value by searching in at least one filter lookup table based on at least one reference parameter, and determine or generate a target image block based on a neural network and the at least one second intermediate value;
[0417] Determine or generate at least one index based on at least one reference parameter, and determine or generate a target image block based on the at least one index and at least one filter lookup table;
[0418] Determine or generate at least one third intermediate value based on a neural network and at least one reference parameter, determine or generate at least one fourth intermediate value based on the at least one third intermediate value and at least one filter lookup table, and determine or generate a target image block based on the neural network and the at least one fourth intermediate value;
[0419] Determine or generate at least one fifth intermediate value based on at least one reference parameter and at least one filter lookup table, determine or generate at least one sixth intermediate value based on the at least one fifth intermediate value and a neural network, and determine or generate a target image block based on the at least one sixth intermediate value and at least one filter lookup table.
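As an illustration of the last configuration in the list above (lookup table, then neural network, then lookup table), the sketch below chains the three stages on a one-dimensional row of pixels. Everything in it is an assumption made for illustration: `tiny_network` is a fixed three-tap smoothing that merely stands in for a trained network, and `FIRST_LUT` and `SECOND_LUT` hold made-up contents.

```python
def clip8(x):
    """Clamp to the 8-bit pixel range so the value is a valid table index."""
    return max(0, min(255, x))


# Hypothetical tables; real tables would be derived from a trained filter.
FIRST_LUT = [clip8(v + 2) for v in range(256)]
SECOND_LUT = [clip8(v - 1) for v in range(256)]


def tiny_network(values):
    """Stand-in for the neural network: (1, 2, 1) / 4 smoothing with edge replication."""
    n = len(values)
    out = []
    for i in range(n):
        left = values[max(i - 1, 0)]
        right = values[min(i + 1, n - 1)]
        out.append((left + 2 * values[i] + right + 2) // 4)
    return out


def lut_then_network_then_lut(pixels):
    """Pixels -> fifth intermediate values -> sixth intermediate values -> target pixels."""
    fifth = [FIRST_LUT[p] for p in pixels]        # first table lookup
    sixth = tiny_network(fifth)                   # network refines the table output
    return [SECOND_LUT[clip8(s)] for s in sixth]  # second table lookup gives the result
```

The other configurations in the list differ only in which stages are present and in what order they are composed.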
[0420] Optionally, the neural network includes at least one of the following: a neural network based on fully connected layers; a neural network based on convolutional layers; a neural network based on Transformer; and a neural network based on a hybrid of convolutional layers, fully connected layers, and Transformer.
[0421] Optionally, when the neural network includes a neural network based on convolutional layers, the processing module A10 is used to:
[0422] Process at least one image block input to a neural network containing a convolutional module to obtain at least one second image block;
[0423] Process the at least one second image block based on the convolutional module and the reference parameters to determine or generate at least one first intermediate value.
[0424] Optionally, the convolutional layers of the convolutional module include at least one of the following: asymmetric convolutional layers; grouped convolutional layers; partial convolutional layers; and depthwise separable convolutional layers.
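To give a sense of why such layer types matter for complexity, the following sketch compares weight counts using the standard formulas for a k x k convolution (biases ignored). The channel counts and group number in the usage note are arbitrary examples, not values from this application, and the function names are illustrative.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights of an ordinary k x k convolution."""
    return k * k * c_in * c_out


def grouped_conv_params(k, c_in, c_out, groups):
    """Weights of a grouped convolution: each output channel sees c_in / groups inputs."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return k * k * (c_in // groups) * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights of a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out
```

For example, with k = 3 and 64 input and output channels, the ordinary layer has 36864 weights, a 4-group layer has 9216, and the depthwise separable pair has 4672.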
[0425] Optionally, the processing module A10 further includes at least one of the following:
[0426] In a convolutional module, at least two convolutional layers have the same convolutional template; at least two convolutional layers in a convolutional module have convolutional templates that are different from each other; at least two convolutional templates have different pixel positions where the convolutional kernel acts; the number and position of pixels in the convolutional template are determined based on the receptive field.
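One way to read "convolutional template" is as the set of pixel positions, relative to the current pixel, on which the kernel acts. The sketch below follows that reading; it is an assumption, as are the two example templates, the border-replication rule, and the name `apply_template`.

```python
# Hypothetical templates as lists of (row offset, column offset) positions.
CROSS_TEMPLATE = [(-1, 0), (0, -1), (0, 0), (0, 1), (1, 0)]
HORIZONTAL_TEMPLATE = [(0, -1), (0, 0), (0, 1)]  # an asymmetric 1 x 3 shape


def apply_template(image, template, weights):
    """Weighted sum over the template positions only, replicating pixels at the border."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for (dy, dx), weight in zip(template, weights):
                yy = min(max(y + dy, 0), height - 1)
                xx = min(max(x + dx, 0), width - 1)
                acc += weight * image[yy][xx]
            out[y][x] = acc
    return out
```

Two layers can share a template (for example, both use `CROSS_TEMPLATE`) or use different ones; choosing which positions appear in a template is how the number and position of pixels would follow from the desired receptive field.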
[0427] Optionally, processing module A10 is configured to perform at least one of the following:
[0428] Determine or generate a first pixel parameter based on at least one reference parameter, and determine or generate at least one index based on the first pixel parameter;
[0429] Determine or generate at least one first characterization value based on at least one reference parameter and at least one characterization lookup table, and determine or generate at least one index based on the at least one first characterization value;
[0430] Determine or generate a second pixel parameter based on at least one reference parameter, determine or generate at least one second characterization value based on the second pixel parameter and at least one characterization lookup table, and determine or generate at least one index based on the at least one second characterization value;
[0431] Determine or generate at least one third characterization value based on at least one reference parameter and at least one characterization lookup table, determine or generate a third pixel parameter based on the at least one third characterization value, and determine or generate at least one index based on the third pixel parameter;
[0432] Determine or generate at least one fourth characterization value based on the most significant bits of at least one pixel value to be filtered, and determine or generate at least one index based on the at least one fourth characterization value;
[0433] Determine or generate at least one fifth characterization value based on the characterization parameters, and determine or generate at least one index based on the at least one fifth characterization value;
[0434] Determine or generate at least one non-representational value based on at least one reference parameter, and determine or generate at least one index based on the at least one non-representational value.
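The option in the list above that derives a characterization value from the most significant bits of a pixel value can be sketched as below. The choice of keeping 4 of 8 bits, and the way several pixels are packed into one index, are assumptions made for illustration; `msb_value` and `pack_index` are hypothetical names.

```python
def msb_value(pixel, bit_depth=8, kept_bits=4):
    """Keep only the kept_bits most significant bits of a pixel value."""
    if not 0 <= pixel < (1 << bit_depth):
        raise ValueError("pixel out of range for the given bit depth")
    return pixel >> (bit_depth - kept_bits)


def pack_index(pixels, bit_depth=8, kept_bits=4):
    """Concatenate the most significant bits of several pixels into one table index."""
    index = 0
    for pixel in pixels:
        index = (index << kept_bits) | msb_value(pixel, bit_depth, kept_bits)
    return index
```

Dropping low-order bits bounds the table size: with n pixels and `kept_bits` bits each, the lookup table needs 2 ** (kept_bits * n) entries rather than 2 ** (bit_depth * n).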
[0435] The processing device provided in this application embodiment is similar in implementation principle and beneficial effect to the technical solution shown in the corresponding method embodiment above, and will not be described again here.
[0436] This application also provides a processing device, including a memory and a processor. The memory stores an image processing program, and when the image processing program is executed by the processor, it implements the steps of the image processing method in any of the above embodiments.
[0437] This application also provides a storage medium storing an image processing program, which, when executed by a processor, implements the steps of the image processing method in any of the above embodiments.
[0438] In the embodiments of the smart terminal and storage medium provided in this application, all the technical features of any of the above-described image processing method embodiments may be included. The extended and explanatory content of the specification is basically the same as that of the embodiments of the above methods, and will not be repeated here.
[0439] This application also provides a computer program product, which includes computer program code. When the computer program code is run on a computer, it causes the computer to perform the methods described in the various possible implementations above.
[0440] This application also provides a chip, including a memory and a processor. The memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that a device with the chip installed performs the methods described in the various possible implementations above.
[0441] It is understood that the above scenarios are merely examples and do not constitute a limitation on the application scenarios of the technical solutions provided in the embodiments of this application. The technical solutions of this application can also be applied to other scenarios. For example, as those skilled in the art will know, with the evolution of system architecture and the emergence of new business scenarios, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.
[0442] The sequence numbers of the embodiments in this application are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.
[0443] The steps in the method of this application embodiment can be adjusted, combined, or deleted according to actual needs.
[0444] The units in the device of this application embodiment can be merged, divided, and deleted according to actual needs.
[0445] In this application, the same or similar terms, concepts, technical solutions and / or application scenario descriptions are generally described in detail only when they first appear, and for the sake of brevity are generally not repeated when they appear again. When reading the technical solutions and other contents of this application, any such terms, concepts, technical solutions and / or application scenario descriptions that are not described in detail later can be understood by reference to the earlier detailed descriptions.
[0446] In this application, the descriptions of the various embodiments have different focuses. For parts that are not described in detail or recorded in a certain embodiment, please refer to the relevant descriptions of other embodiments.
[0447] The technical features of the present application can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of the present application.
[0448] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) as described above, and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, controlled terminal, or network device, etc.) to execute the methods of each embodiment of this application.
[0449] The above embodiments can be implemented, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented using software, they can be implemented, in whole or in part, as a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the flow or function according to the embodiments of this application is generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a storage medium or transmitted from one storage medium to another. For example, computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line) or wireless (e.g., infrared, radio, microwave, etc.) means. The storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, storage disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid-state disk (SSD)).
[0450] The above are merely preferred embodiments of this application and do not limit the patent scope of this application. Any equivalent structural or procedural transformations made using the content of this application's specification and drawings, or direct or indirect applications in other related technical fields, are similarly included within the patent protection scope of this application.
Claims
1. An image processing method, characterized in that it includes the following steps: S10, determining or generating a target image block based on reference parameters of at least one image block and a filter lookup table; wherein the filter lookup table is a multi-layer filter lookup table, and step S10 includes: determining or generating at least one first index based on at least one reference parameter; inputting the first index into the first layer of the multi-layer filter lookup table for searching to obtain a first search result; determining or obtaining a second index based on the first search result; and inputting the second index into the second layer of the multi-layer filter lookup table for searching, and so on until the last layer of the filter lookup table, whose output is the target image block.
2. The method as described in claim 1, characterized in that the reference parameters include at least one of the following: quantization parameters; boundary strength; characterization parameters; position information of the pixel to be filtered; pixel value to be filtered; size information of the image block; filtering information of neighboring blocks; filtering information of non-neighboring blocks; filtering information across component blocks; filtering information of co-located blocks; filtering information of temporal blocks; filtering information of default blocks; and filtering information of candidate blocks.
3. The method as described in claim 2, characterized in that candidate blocks are determined by motion vectors and / or block vectors; and / or, step S10 includes at least one of the following:
Determining or generating at least one first intermediate value based on a neural network and at least one reference parameter, and determining or generating a target image block based on the at least one first intermediate value and at least one filter lookup table; specifically, the at least one first intermediate value is converted into an index and input into the first layer of the multi-layer filter lookup table for searching to obtain a first search result, a second index is determined or obtained based on the first search result, and the second index is input into the second layer of the multi-layer filter lookup table for searching, and so on until the last layer of the filter lookup table, whose output is the target image block;
Searching at least one filter lookup table based on at least one reference parameter to obtain at least one second intermediate value, and determining or generating a target image block based on a neural network and the at least one second intermediate value; specifically, the at least one reference parameter is converted into an index and input into the first layer of the multi-layer filter lookup table for searching to obtain a first search result, a second index is determined or obtained based on the first search result, and the second index is input into the second layer of the multi-layer filter lookup table for searching, and so on until the last layer of the filter lookup table, to obtain the at least one second intermediate value; the target image block is then determined or generated based on the neural network and the at least one second intermediate value;
Determining or generating at least one third intermediate value based on a neural network and at least one reference parameter, determining or generating at least one fourth intermediate value based on the at least one third intermediate value and at least one filter lookup table, and determining or generating a target image block based on the neural network and the at least one fourth intermediate value; specifically, the at least one third intermediate value is converted into an index and input into the first layer of the multi-layer filter lookup table for searching to obtain a first search result, a second index is determined or obtained based on the first search result, and the second index is input into the second layer of the multi-layer filter lookup table for searching, and so on until the last layer of the filter lookup table, to obtain the at least one fourth intermediate value; the target image block is then determined or generated based on the neural network and the at least one fourth intermediate value;
Determining or generating at least one fifth intermediate value based on at least one reference parameter and at least one filter lookup table, determining or generating at least one sixth intermediate value based on the at least one fifth intermediate value and a neural network, and determining or generating a target image block based on the at least one sixth intermediate value and at least one filter lookup table; specifically, the at least one reference parameter is converted into an index and input into the first layer of the multi-layer filter lookup table for searching to obtain a first search result, a second index is determined or obtained based on the first search result, and the second index is input into the second layer of the multi-layer filter lookup table for searching, and so on until the last layer of the filter lookup table, to obtain the at least one fifth intermediate value; the at least one sixth intermediate value is then determined or generated based on the at least one fifth intermediate value and the neural network, and the target image block is determined or generated based on the at least one sixth intermediate value and at least one filter lookup table.
4. The method as described in claim 3, characterized in that the neural network includes at least one of the following: a neural network based on fully connected layers; a neural network based on convolutional layers; a neural network based on Transformer; and a neural network based on a hybrid of convolutional layers, fully connected layers, and Transformer.
5. The method as described in claim 3, characterized in that, when the neural network includes a neural network based on convolutional layers, determining or generating at least one first intermediate value based on the neural network and at least one reference parameter includes: processing at least one image block input to a neural network containing a convolutional module to obtain at least one second image block; and processing the at least one second image block based on the convolutional module and the reference parameters to determine or generate the at least one first intermediate value.
6. The method as described in claim 5, characterized in that the convolutional layers of the convolutional module include at least one of the following: asymmetric convolutional layers; grouped convolutional layers; partial convolutional layers; and depthwise separable convolutional layers.
7. The method as described in claim 5, characterized in that it also includes at least one of the following: at least two convolutional layers in the convolutional module have the same convolutional template; at least two convolutional layers in the convolutional module have convolutional templates that are different from each other; the pixel positions where the convolutional kernel acts are different in at least two convolutional templates; and the number and position of pixels in the convolutional template are determined based on the receptive field.
8. The method as described in claim 3, characterized in that determining or generating at least one index based on at least one reference parameter includes at least one of the following: determining or generating a first pixel parameter based on at least one reference parameter, and determining or generating at least one index based on the first pixel parameter; determining or generating at least one first characterization value based on at least one reference parameter and at least one characterization lookup table, and determining or generating at least one index based on the at least one first characterization value; determining or generating a second pixel parameter based on at least one reference parameter, determining or generating at least one second characterization value based on the second pixel parameter and at least one characterization lookup table, and determining or generating at least one index based on the at least one second characterization value; determining or generating at least one third characterization value based on at least one reference parameter and at least one characterization lookup table, determining or generating a third pixel parameter based on the at least one third characterization value, and determining or generating at least one index based on the third pixel parameter; determining or generating at least one fourth characterization value based on the most significant bits of at least one pixel value to be filtered, and determining or generating at least one index based on the at least one fourth characterization value; determining or generating at least one fifth characterization value based on the characterization parameters, and determining or generating at least one index based on the at least one fifth characterization value; and determining or generating at least one non-representational value based on at least one reference parameter, and determining or generating at least one index based on the at least one non-representational value.
9. A processing device, characterized in that it includes a memory and a processor, wherein the memory stores an image processing program, and when the image processing program is executed by the processor, it implements the steps of the image processing method as described in claim 1.
10. A storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the steps of the image processing method as described in claim 1.