Image processing method, processing device and storage medium
By comprehensively considering the reference block and derived blocks of the block to be predicted in image processing, and using neural networks and lookup tables to generate the predicted block, the problem of ignoring the information of overlapping pixel regions at the boundary in the prior art is solved, and the prediction effect and coding performance are improved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHENZHEN TRANSSION HLDG CO LTD
- Filing Date
- 2025-06-17
- Publication Date
- 2026-06-23
AI Technical Summary
In image processing, existing techniques ignore the information of overlapping pixel regions at the boundaries between adjacent image blocks, which affects the prediction results.
When determining or generating prediction blocks, the derived blocks of the reference block of the block to be predicted are comprehensively considered, including luminance and chrominance component derived blocks. The prediction blocks are generated using neural networks and lookup tables, and cropping and padding operations are combined to match size parameters to improve prediction performance.
It improves the prediction performance of image processing and enhances coding performance without significantly increasing computational complexity.
Description
Technical Field
[0001] This application relates to the field of image processing technology, specifically to an image processing method, processing device, and storage medium.
Background Technology
[0002] Existing high-efficiency video coding frameworks, such as Neural Network Based Video Coding (NNVC) and / or Enhanced Compression Model (ECM), propose a video frame coding technique to improve coding performance without significantly increasing computational complexity.
[0003] In the process of conceiving and implementing this application, the inventors discovered at least the following problems: In the prediction processing stage of the encoding and decoding process, a neural network can be used for prediction processing. However, when using a neural network for prediction, the information of the overlapping pixel region between two adjacent image blocks is ignored, which affects the prediction effect.
[0004] The preceding description is intended to provide general background information and does not necessarily constitute prior art.
Summary of the Invention
[0005] To address the aforementioned technical problems, this application provides an image processing method, processing device, and storage medium, aiming to solve the technical problem of how to improve the prediction effect of prediction processing.
[0006] This application provides an image processing method, applicable to a processing device, comprising the following steps:
[0007] S1, determine or generate a prediction block based on the derived block corresponding to the reference block of at least one block to be predicted.
[0008] Optionally, the reference block of at least one block to be predicted includes a reference prediction block and / or a reference reconstruction block of the reference block.
[0009] Optionally, the reference block is determined or obtained based on at least one of the following:
[0010] At least one of the following pixels of the block to be predicted: the above pixels, the non-adjacent above pixels, the left pixels, the non-adjacent left pixels, the above-left pixels, the non-adjacent above-left pixels, the below-left pixels, the non-adjacent below-left pixels, the above-right pixels, and the non-adjacent above-right pixels;
[0011] At least one of the following blocks of the block to be predicted: neighboring block, non-neighboring block, cross-component block, co-located block, temporal block, and default block;
[0012] The width, height, size, and area of the block to be predicted;
[0013] Candidate blocks determined or generated using candidate motion vectors or candidate block vectors of the block to be predicted;
[0014] If the first information of the block to be predicted satisfies the first condition, then the reference block is the first reference block;
[0015] If the first information of the block to be predicted satisfies the second condition, then the reference block is the second reference block.
[0016] Optionally, the image processing method further includes at least one of the following:
[0017] The derived block includes at least one of the following: a luminance component derived block, a chrominance component derived block, and a cross-component derived block;
[0018] The luminance component derived block includes at least one of a first horizontal derived block, a first vertical derived block, and a first horizontal-vertical hybrid derived block;
[0019] The chrominance component derived block includes at least one of a second horizontal derived block, a second vertical derived block, and a second horizontal-vertical hybrid derived block;
[0020] The cross-component derived block includes at least one of a third horizontal derived block, a third vertical derived block, and a third horizontal-vertical hybrid derived block.
[0021] Optionally, step S1 includes the following steps:
[0022] S11, determine or obtain at least one derived block based on the reference reconstruction block and / or reference prediction block of at least one reference block;
[0023] S12, determine or generate a prediction block based on the neural network and / or lookup table, and at least one derived block.
[0024] Optionally, the size parameters of the derived block are matched with the size parameters of the reference reconstructed block and / or the reference prediction block.
[0025] Optionally, step S11 includes:
[0026] Crop a portion of at least one reference reconstruction block and / or reference prediction block;
[0027] Determine or obtain at least one derived block based on the result of padding the at least one cropped reference reconstruction block and / or reference prediction block.
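The cropping and padding of step S11 can be illustrated with a minimal sketch. It is not the method of this application: the function names, the choice of cropping the leftmost columns, and the edge-repetition fill rule are all assumptions made for illustration. The only property taken from the text is that the derived block keeps the size of the reference block.

```python
from typing import List

Block = List[List[int]]  # rows of pixel samples


def crop_columns(block: Block, count: int, side: str) -> Block:
    """Remove `count` columns from the given side ('left' or 'right')."""
    if not 0 <= count < len(block[0]):
        raise ValueError("count must leave at least one column")
    if side == "left":
        return [row[count:] for row in block]
    if side == "right":
        return [row[: len(row) - count] for row in block]
    raise ValueError("side must be 'left' or 'right'")


def pad_columns(block: Block, count: int, side: str) -> Block:
    """Add `count` columns on the given side by repeating the edge sample.

    Edge repetition is an assumed fill rule, chosen only for illustration.
    """
    if side == "left":
        return [[row[0]] * count + row for row in block]
    if side == "right":
        return [row + [row[-1]] * count for row in block]
    raise ValueError("side must be 'left' or 'right'")


def horizontal_derived_block(reference: Block, shift: int) -> Block:
    """Crop `shift` leftmost columns, then pad `shift` columns on the right,
    so that the result has the same size as the reference block."""
    cropped = crop_columns(reference, shift, "left")
    return pad_columns(cropped, shift, "right")
```

For example, calling `horizontal_derived_block([[1, 2, 3, 4], [5, 6, 7, 8]], 2)` drops the two leftmost columns and repeats the last column twice, giving a block of the original 2 x 4 size.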
[0028] Optionally, the portion of at least one reference reconstruction block and / or reference prediction block is determined or obtained based on at least one of the following:
[0029] At least one of the following: at least one leftmost column, at least one rightmost column, at least one topmost row, and at least one bottommost row of a reference reconstruction block and / or a reference prediction block;
[0030] Transformation size parameters;
[0031] Cropping size parameters;
[0032] The sliding area size and / or the preset sliding step size of the sliding window on at least one reconstruction block and / or prediction block.
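As an illustration of how a sliding area size and a preset sliding step size could select regions of a block, the following sketch enumerates window positions. The function name, the raster scanning order, and the rule that a window must lie entirely inside the block are assumptions, not details stated in this application.

```python
from typing import Iterator, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height)


def sliding_regions(block_w: int, block_h: int,
                    win_w: int, win_h: int,
                    step_x: int, step_y: int) -> Iterator[Region]:
    """Yield every window position that lies fully inside the block,
    scanning left to right and then top to bottom."""
    if min(win_w, win_h, step_x, step_y) <= 0:
        raise ValueError("window sizes and step sizes must be positive")
    for y in range(0, block_h - win_h + 1, step_y):
        for x in range(0, block_w - win_w + 1, step_x):
            yield (x, y, win_w, win_h)
```

With an 8 x 4 block, a 4 x 4 window and a step of 2 in both directions, the sketch yields three horizontally overlapping regions starting at x = 0, 2 and 4.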
[0033] Optionally, the image processing method further includes at least one of the following:
[0034] The cropping size parameter is less than or equal to the transformation size parameter;
[0035] The transformation size parameters include at least one of a transformation width and a transformation height;
[0036] The transformation width is greater than or equal to the number of columns cropped and / or padded in the portion of at least one reference reconstruction block and / or reference prediction block;
[0037] The transformation height is greater than or equal to the number of rows cropped and / or padded in the portion of at least one reference reconstruction block and / or reference prediction block.
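The size constraints listed above can be expressed as a simple consistency check. The sketch below is illustrative only: the `SizeParams` container and its field names are hypothetical, and comparing the two size parameters separately per dimension is an assumed reading of the first constraint.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SizeParams:
    """Hypothetical container; the field names are illustrative only."""
    transform_width: int
    transform_height: int
    crop_width: int
    crop_height: int
    columns_changed: int  # columns cropped and/or padded
    rows_changed: int     # rows cropped and/or padded


def size_violations(p: SizeParams) -> List[str]:
    """Return a description of every constraint the parameters break."""
    problems = []
    if p.crop_width > p.transform_width:
        problems.append("crop width exceeds transformation width")
    if p.crop_height > p.transform_height:
        problems.append("crop height exceeds transformation height")
    if p.transform_width < p.columns_changed:
        problems.append("transformation width is below the changed column count")
    if p.transform_height < p.rows_changed:
        problems.append("transformation height is below the changed row count")
    return problems
```

An empty list means that all four relations hold for the given parameters.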
[0038] Optionally, the padding method for the at least one cropped reference reconstruction block and / or reference prediction block includes at least one of the following:
[0039] Pad an even number of columns of pixels on the leftmost or rightmost side of the at least one cropped reference reconstruction block and / or reference prediction block;
[0040] Pad an even number of rows of pixels on the top or bottom of the at least one cropped reference reconstruction block and / or reference prediction block.
[0041] Optionally, the image processing method further includes at least one of the following:
[0042] Transform at least one of the first horizontal derived block, the first vertical derived block, and the first horizontal-vertical hybrid derived block to obtain the first transformation feature;
[0043] Transform at least one of the second horizontal derived block, the second vertical derived block, and the second horizontal-vertical hybrid derived block to obtain the second transformation feature;
[0044] Transform at least one of the third horizontal derived block, the third vertical derived block, and the third horizontal-vertical hybrid derived block to obtain the third transformation feature;
[0045] The reconstructed image information and / or predicted image information of at least one reference reconstruction block and / or reference prediction block are transformed to obtain the fourth transformation feature.
[0046] Optionally, step S1 includes at least one of the following:
[0047] Based on the neural network and / or lookup table, and the result of channel concatenation of the first transformation feature and the fourth transformation feature, a first feature set is determined or obtained, and a prediction block is determined or generated based on the first feature set;
[0048] Based on the neural network and / or lookup table, and the result of channel concatenation of the second transformation feature and the fourth transformation feature, a first feature set is determined or obtained, and a prediction block is determined or generated based on the first feature set;
[0049] Based on the neural network and / or lookup table, and the result of channel concatenation of the first transformation feature, the second transformation feature, and the fourth transformation feature, a first feature set is determined or obtained, and a prediction block is determined or generated based on the first feature set;
[0050] Based on the neural network and / or lookup table, and the result of channel concatenation of the third transformation feature and the fourth transformation feature, a first feature set is determined or obtained, and a prediction block is determined or generated based on the first feature set.
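The channel splicing (concatenation) of transformation features described above can be sketched as follows. The channel-first layout and the function name are assumptions. The sketch shows only how features with the same spatial size are joined along the channel axis, not how the transformation features themselves are computed.

```python
from typing import List

Channel = List[List[float]]   # channel[y][x]
Feature = List[Channel]       # channel-first layout: feature[c][y][x]


def concat_channels(*features: Feature) -> Feature:
    """Join features along the channel axis.

    Every channel must share the same height and width; the channel
    counts of the individual features may differ.
    """
    if not features or not features[0]:
        raise ValueError("at least one non-empty feature is required")
    size = (len(features[0][0]), len(features[0][0][0]))
    merged: Feature = []
    for feature in features:
        for channel in feature:
            if (len(channel), len(channel[0])) != size:
                raise ValueError("all channels must share the same height and width")
            merged.append(channel)
    return merged
```

Joining a one-channel feature with a two-channel feature of the same spatial size gives a three-channel result, which could then stand in for the first feature set.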
[0051] Optionally, the prediction block is determined or generated based on the first feature set, including at least one of the following:
[0052] The convolution module of the neural network convolves some of the features in the first feature set to obtain a first convolution feature. The first convolution feature is fused with the features in the first feature set that were not convolved to determine or generate the prediction block.
[0053] Some of the features in the first feature set are looked up in a first lookup table to obtain first lookup-table features. The first lookup-table features are fused with the features in the first feature set that were not looked up to determine or generate the prediction block.
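The idea of processing only part of the first feature set and fusing the result with the untouched features can be sketched with a lookup table. Several points are assumptions made for illustration: the function names, the use of sample values as table indices, and fusion by keeping every channel in its original position. The application does not specify the fusion rule, and a convolution could be passed as `op` in place of the lookup.

```python
from typing import Callable, List, Sequence

Channel = List[List[int]]  # channel[y][x], integer samples


def lut_op(table: Sequence[int]) -> Callable[[Channel], Channel]:
    """Build an operation that replaces each sample by its table entry."""
    def apply(channel: Channel) -> Channel:
        return [[table[value] for value in row] for row in channel]
    return apply


def process_some_and_fuse(features: List[Channel],
                          selected: Sequence[int],
                          op: Callable[[Channel], Channel]) -> List[Channel]:
    """Apply `op` to the selected channels only and keep the others as is.

    Fusion here means returning processed and unprocessed channels
    together in their original order (an assumed fusion rule).
    """
    chosen = set(selected)
    return [op(channel) if index in chosen else channel
            for index, channel in enumerate(features)]
```

In this sketch only the channels named in `selected` pass through the table, so the cost of the lookup grows with the number of selected channels rather than with the size of the whole feature set.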
[0054] This application also provides a processing apparatus, including:
[0055] The processing module is used to determine or generate a prediction block based on the derived block corresponding to the reference block of at least one block to be predicted.
[0056] This application also provides a processing device, including: a memory and a processor, wherein the memory stores an image processing program, and when the image processing program is executed by the processor, it implements the steps of any of the image processing methods described above.
[0057] This application also provides a storage medium storing a computer program that, when executed by a processor, implements the steps of any of the image processing methods described above.
[0058] As described above, the image processing method of this application can be applied to a processing device, including: determining or generating a prediction block based on derived blocks of at least one reference block of a block to be predicted. Through the technical solution of this application, when determining or generating a prediction block, the derived blocks of at least one reference block of a block to be predicted are comprehensively considered, which can improve the prediction effect of the prediction processing.
Attached Figure Description
[0059] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application. To more clearly illustrate the technical solutions of the embodiments of this application, the drawings used in the description of the embodiments will be briefly introduced below. Obviously, those skilled in the art can obtain other drawings based on these drawings without any creative effort.
[0060] Figure 1 is a schematic diagram of the hardware structure of a mobile terminal implementing the various embodiments of this application;
[0061] Figure 2 is a communication network system architecture diagram provided in an embodiment of this application;
[0062] Figure 3 is a schematic diagram of the hardware structure of a controller 140 provided in this application;
[0063] Figure 4 is a schematic diagram of the hardware structure of a network node 150 provided in this application;
[0064] Figure 5 is a flowchart of the image processing method according to the first embodiment;
[0065] Figure 6 is a flowchart of the image processing method according to the third embodiment;
[0066] Figure 7 is a schematic diagram of a horizontal derived block provided in this application;
[0067] Figure 8 is a schematic diagram of a vertical derived block provided in this application;
[0068] Figure 9 is a schematic diagram of a horizontal-vertical hybrid derived block provided in this application;
[0069] Figure 10 is a schematic diagram of the processing module of the processing device.
[0070] The realization of the objectives, functional features, and advantages of this application will be further explained in conjunction with the embodiments and with reference to the accompanying drawings. The accompanying drawings have illustrated specific embodiments of this application, which will be described in more detail below. These drawings and textual descriptions are not intended to limit the scope of the concept in any way, but rather to illustrate the concepts of this application to those skilled in the art through reference to specific embodiments.
Detailed Implementation
[0071] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.
[0072] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element. Furthermore, components, features, and elements with the same names in different embodiments of this application may have the same meaning or different meanings, the specific meaning of which must be determined by its interpretation in that specific embodiment or further in conjunction with the context of that specific embodiment.
[0073] It should be understood that although the terms first, second, third, etc., may be used herein to describe various information, such information should not be limited to these terms. These terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of this document, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word “if” as used herein may be interpreted as “when…” or “in response to determination”. Furthermore, as used herein, the singular forms “a,” “an,” and “the” are intended to also include the plural forms unless the context indicates otherwise. It should be further understood that the terms “comprising,” “including,” indicate the presence of the stated feature, step, operation, element, component, item, kind, and / or group, but do not exclude the presence, occurrence, or addition of one or more other features, steps, operations, elements, components, items, kinds, and / or groups. The terms “or,” “and / or,” “including at least one of the following,” etc., as used in this application may be interpreted as inclusive, or mean any one or any combination thereof. For example, "including at least one of the following: A, B, C" means "any one of the following: A; B; C; A and B; A and C; B and C; A and B and C." Similarly, "A, B, or C" or "A, B, and / or C" means "any one of the following: A; B; C; A and B; A and C; B and C; A and B and C." Exceptions to this definition only occur when the combination of elements, functions, steps, or operations is inherently mutually exclusive in some way.
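The inclusive reading of "at least one of the following" can be checked by enumeration. The short sketch below, with an illustrative function name, lists every non-empty combination of the given items. For A, B and C it produces the same seven combinations, in the same order, as the example in the paragraph above.

```python
from itertools import combinations
from typing import Iterable, List, Tuple


def at_least_one_of(items: Iterable[str]) -> List[Tuple[str, ...]]:
    """Return every non-empty combination of the items, smallest first."""
    pool = list(items)
    return [combo
            for size in range(1, len(pool) + 1)
            for combo in combinations(pool, size)]
```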
[0074] It should be understood that although the steps in the flowcharts of this application's embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times, and their execution order is not necessarily sequential, but can be performed alternately or in turn with other steps or at least a portion of the sub-steps or stages of other steps.
[0075] Depending on the context, the words “if” or “suppose” as used here can be interpreted as “when” or “in response to determination” or “in response to detection.” Similarly, depending on the context, the phrases “if determination” or “if detection (of the stated condition or event)” can be interpreted as “when determination” or “in response to determination” or “when detection (of the stated condition or event)” or “in response to detection (of the stated condition or event).”
[0076] It should be noted that step designations such as S11 and S12 are used in this document for the purpose of more clearly and concisely describing the corresponding content, and do not constitute a substantial limitation on the order. In specific implementation, those skilled in the art may execute S12 first and then S11, etc., but these should all be within the protection scope of this application.
[0077] It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to limit this application.
[0078] In the following description, suffixes such as "module," "part," or "unit" are used to denote elements solely for ease of description and have no specific meaning in themselves. Therefore, "module," "part," or "unit" may be used interchangeably.
[0079] The processing device in this application can be a smart terminal or a server, and the smart terminal can be implemented in various forms. For example, the smart terminal described in this application can include smart terminals such as mobile phones, tablets, laptops, handheld computers, personal digital assistants (PDAs), portable media players (PMPs), navigation devices, wearable devices, smart bracelets, pedometers, etc., as well as fixed terminals such as digital TVs and desktop computers.
[0080] The following description will use a mobile terminal as an example. Those skilled in the art will understand that, apart from elements specifically designed for mobile purposes, the construction according to the embodiments of this application can also be applied to fixed-type terminals.
[0081] Please see Figure 1, which is a schematic diagram of the hardware structure of a mobile terminal implementing various embodiments of this application. The mobile terminal 100 may include: an RF (Radio Frequency) unit 101, a WiFi module 102, an audio output unit 103, an A / V (Audio / Video) input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, and a power supply 111, etc. Those skilled in the art will understand that the mobile terminal structure shown in Figure 1 does not constitute a limitation on the mobile terminal. The mobile terminal may include more or fewer components than shown, or combine certain components, or have different component arrangements.
[0082] The components of the mobile terminal are described in detail below with reference to Figure 1:
[0083] The radio frequency unit 101 can be used for receiving and transmitting signals during information transmission or calls. Specifically, it receives downlink information from the base station and passes it to the processor 110 for processing; additionally, it transmits uplink data to the base station. Typically, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier, and a duplexer. Furthermore, the radio frequency unit 101 can also communicate wirelessly with networks and other devices. The aforementioned wireless communications may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division Duplexing-Long Term Evolution), TDD-LTE (Time Division Duplexing-Long Term Evolution), 5G, and 6G.
[0084] WiFi is a short-range wireless transmission technology. Through the WiFi module 102, the mobile terminal can help users send and receive emails, browse web pages, and access streaming media, providing users with wireless broadband internet access. Although Figure 1 shows the WiFi module 102, it is understood that it is not a necessary component of the mobile terminal and can be omitted as needed without changing the nature of the invention.
[0085] The audio output unit 103 can convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into audio signals and output them as sound when the mobile terminal 100 is in call signal receiving mode, call mode, recording mode, voice recognition mode, broadcast receiving mode, etc. Furthermore, the audio output unit 103 can also provide audio output related to specific functions performed by the mobile terminal 100 (e.g., call signal receiving sound, message receiving sound, etc.). The audio output unit 103 may include a speaker, a buzzer, etc.
[0086] The A / V input unit 104 is used to receive audio or video signals. The A / V input unit 104 may include a graphics processing unit (GPU) 1041 and a microphone 1042. The GPU 1041 processes image data of still images or videos acquired by an image capture device (such as a camera) in video capture mode or image capture mode. The processed image frames can be displayed on the display unit 106. The image frames processed by the GPU 1041 can be stored in the memory 109 (or other storage medium) or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 can receive sound (audio data) in operating modes such as telephone call mode, recording mode, and voice recognition mode, and can process such sound into audio data. The processed audio (voice) data can be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 101 in telephone call mode. The microphone 1042 can implement various types of noise cancellation (or suppression) algorithms to eliminate (or suppress) noise or interference generated during the reception and transmission of audio signals.
[0087] The mobile terminal 100 also includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Optionally, the light sensor includes an ambient light sensor and a proximity sensor. Optionally, the ambient light sensor can adjust the brightness of the display panel 1061 according to the ambient light level, and the proximity sensor can turn off the display panel 1061 and / or backlight when the mobile terminal 100 is moved to the ear. As a type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in various directions (generally three axes), and can detect the magnitude and direction of gravity when stationary. It can be used for applications that recognize the phone's posture (such as landscape / portrait switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer, tapping), etc. Other sensors that may be configured in the phone, such as fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, will not be described in detail here.
[0088] The display unit 106 is used to display information input by the user or information provided to the user. The display unit 106 may include a display panel 1061, which may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
[0089] User input unit 107 can be used to receive input numerical or character information, and generate key signal inputs related to user settings and function control of the mobile terminal. Optionally, user input unit 107 may include touch panel 1071 and other input devices 1072. Touch panel 1071, also known as a touch screen, can collect touch operations performed by the user on or near it (such as operations performed by the user using a finger, stylus, or any suitable object or accessory on or near touch panel 1071), and drive corresponding connection devices according to a pre-set program. Touch panel 1071 may include a touch detection device and a touch controller. Optionally, the touch detection device detects the user's touch position and the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, sends it to processor 110, and can receive and execute commands sent by processor 110. In addition, touch panel 1071 can be implemented using various types such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 1071, the user input unit 107 may also include other input devices 1072. Optionally, other input devices 1072 may include, but are not limited to, one or more of the following: physical keyboard, function keys (such as volume control buttons, power buttons, etc.), trackball, mouse, joystick, etc., without being specifically limited here.
[0090] Optionally, the touch panel 1071 may cover the display panel 1061. When the touch panel 1071 detects a touch operation on or near it, it transmits the information to the processor 110 to determine the type of touch event. Subsequently, the processor 110 provides corresponding visual output on the display panel 1061 based on the type of touch event. Although in Figure 1 the touch panel 1071 and the display panel 1061 are shown as two independent components that realize the input and output functions of the mobile terminal, in some embodiments the touch panel 1071 and the display panel 1061 can be integrated to realize the input and output functions of the mobile terminal. The specific implementation is not limited here.
[0091] Interface unit 108 serves as an interface through which at least one external device can connect to mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device with an identification module, an audio input / output (I / O) port, a video I / O port, a headphone port, and so on. Interface unit 108 may be used to receive input (e.g., data, power, etc.) from the external device and transmit the received input to one or more elements within mobile terminal 100, or it may be used to transmit data between mobile terminal 100 and the external device.
[0092] The memory 109 can be used to store software programs and various data. The memory 109 may primarily include a program storage area and a data storage area. Optionally, the program storage area may store the operating system, applications required for at least one function (such as sound playback, image playback, etc.), etc.; the data storage area may store data created based on the use of the mobile phone (such as audio data, phonebook, etc.). Furthermore, the memory 109 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device.
[0093] The processor 110 is the control center of the mobile terminal. It connects various parts of the mobile terminal via various interfaces and lines. By running or executing software programs and / or modules stored in the memory 109, and by calling data stored in the memory 109, it performs various functions and processes data of the mobile terminal, thereby providing overall monitoring of the mobile terminal. The processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor and a modem processor. Optionally, the application processor mainly handles the operating system, user interface, and applications, while the modem processor mainly handles wireless communication. It is understood that the modem processor may not be integrated into the processor 110.
[0094] The mobile terminal 100 may also include a power supply 111 (such as a battery) that supplies power to various components. Preferably, the power supply 111 can be logically connected to the processor 110 through a power management system, thereby enabling functions such as charging, discharging, and power consumption management through the power management system.
[0095] Although not shown in Figure 1, the mobile terminal 100 may also include a Bluetooth module, etc., which will not be described in detail here.
[0096] To facilitate understanding of the embodiments of this application, the communication network system on which the mobile terminal of this application is based is described below.
[0097] Please see Figure 2, which is a communication network system architecture diagram provided in an embodiment of this application. The communication network system is an LTE system based on the universal mobile communication technology. The LTE system includes a UE (User Equipment) 201, an E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) 202, an EPC (Evolved Packet Core) 203, and the operator's IP services 204, which are connected in sequence.
[0098] Optionally, UE201 can be the aforementioned mobile terminal 100, which will not be described in detail here.
[0099] E-UTRAN202 includes eNodeB2021 and other eNodeB2022, etc. Optionally, eNodeB2021 can connect to other eNodeB2022 via backhaul (e.g., X2 interface), and eNodeB2021 connects to EPC203, providing access from UE201 to EPC203.
[0100] EPC203 may include MME (Mobility Management Entity) 2031, HSS (Home Subscriber Server) 2032, other MMEs 2033, SGW (Serving Gateway) 2034, PGW (Packet Data Network Gateway) 2035, and PCRF (Policy and Charging Rules Function) 2036, etc. Optionally, MME2031 is the control node that handles signaling between UE201 and EPC203, providing bearer and connection management. HSS2032 is used to provide registers to manage functions such as the Home Location Register (not shown in the figure) and stores user-specific information such as service characteristics and data rates. All user data can be sent through SGW2034. PGW2035 can provide IP address allocation for UE201 and other functions. PCRF2036 is the policy and charging control decision point for service data flows and IP bearer resources. It selects and provides available policy and charging control decisions for the policy and charging enforcement function unit (not shown in the figure).
[0101] IP services 204 may include the Internet, intranet, IMS (IP Multimedia Subsystem), or other IP services.
[0102] Although the above description uses the LTE system as an example, those skilled in the art should know that this application is not only applicable to the LTE system, but also to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA, 5G and future new network systems (such as 6G), etc., without limitation.
[0103] Figure 3 is a schematic diagram of the hardware structure of a controller 140 provided in this application. The controller 140 includes a memory 1401 and a processor 1402. The memory 1401 is used to store program instructions, and the processor 1402 is used to call the program instructions in the memory 1401 to execute the steps performed by the controller in the first embodiment of the above method. The implementation principle and beneficial effects are similar, and will not be described again here.
[0104] Optionally, the controller further includes a communication interface 1403, which can be connected to the processor 1402 via a bus 1404. The processor 1402 can control the communication interface 1403 to implement the receiving and sending functions of the controller 140.
[0105] Figure 4 is a schematic diagram of the hardware structure of a network node 150 provided in this application. The network node 150 includes a memory 1501 and a processor 1502. The memory 1501 is used to store program instructions, and the processor 1502 is used to call the program instructions in the memory 1501 to execute the steps performed by the first node in the first embodiment of the above method. The implementation principle and beneficial effects are similar, and will not be described again here.
[0106] Optionally, the network node further includes a communication interface 1503, which can be connected to the processor 1502 via a bus 1504. The processor 1502 can control the communication interface 1503 to implement the receiving and sending functions of the network node 150.
[0107] The integrated modules described above, implemented as software functional modules, can be stored in a computer-readable storage medium. These software functional modules, stored in a storage medium, include several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute some steps of the methods of the various embodiments of this application.
[0108] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the flow or function according to the embodiments of this application is generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a storage medium or transmitted from one storage medium to another. For example, computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state disk, SSD), etc.
[0109] Based on the above-described mobile terminal hardware structure and communication network system, various embodiments of this application are proposed.
[0110] First Embodiment
[0111] Referring to Figure 5, Figure 5 is a flowchart illustrating the image processing method according to the first embodiment. The image processing method of this embodiment can be applied to a processing device and includes step S1:
[0112] Step S1: Determine or generate a prediction block based on the derived block corresponding to the reference block of at least one block to be predicted.
[0113] In this embodiment, the processing device can be a smart terminal, such as a mobile phone or computer, or a server, such as a local server or a cloud server. This embodiment and this application primarily use a smart terminal as an example for illustration.
[0114] Optionally, the technical solution of this embodiment can be applied to fields such as image encoding and decoding, video encoding and decoding, hardware video encoding and decoding, dedicated circuit video encoding and decoding, and real-time video encoding and decoding.
[0115] Optionally, the processing device can pre-store various images and videos, and can select one image to be predicted from among the images as an image block, or cut the selected image and use the cut image block as the image block to be predicted. Alternatively, it can extract a frame from the video sequence as an image block, or cut the extracted frame to obtain an image block. Or, the processing device receives the input image or video, extracts a frame from the image or video as an image block, or cuts the extracted frame to obtain an image block. Alternatively, the processing device receives images or videos sent by other network devices, extracts a frame from the image or video as an image block, or cuts the extracted frame to obtain an image block. In this case, the processing device establishes a communication connection with the network device on the mobile communication system network side in advance, so that the network device can send images or videos to the terminal device through the communication connection, and the terminal device receives the images or videos.
[0116] Optionally, the technical solution of this embodiment can be performed for intra-frame prediction or inter-frame prediction. No limitation is imposed here.
[0117] Optionally, the reference block is an image block that has already been predicted and / or reconstructed.
[0118] Optionally, the reference block for at least one image block may be other image blocks used to assist in the prediction processing of at least one image block.
[0119] Optionally, the reference block can be a rectangular block, a non-rectangular block, an L-shaped block, or a T-shaped block.
[0120] Optionally, the reference block may include multiple rows and / or columns of pixels, or it may include a row and / or a column of pixels.
[0121] Optionally, the reference block of at least one block to be predicted includes a reference prediction block and / or a reference reconstruction block of the reference block.
[0122] Optionally, at least one reference block corresponds to a derived block, including a derived block corresponding to a reference prediction block and / or a derived block corresponding to a reference reconstruction block.
[0123] Optionally, the reference block is determined or obtained according to at least one of the following methods one through six:
[0124] Method 1: At least one of the following: the top adjacent pixel, the top non-adjacent pixel, the left adjacent pixel, the left non-adjacent pixel, the top left adjacent pixel, the top left non-adjacent pixel, the bottom left adjacent pixel, the bottom left non-adjacent pixel, the top right adjacent pixel, and the top right non-adjacent pixel of the block to be predicted;
[0125] Optionally, the pixel to be predicted in the block to be predicted can be used as the first pixel.
[0126] Optionally, the adjacent pixel above can be a pixel in the same image block as the first pixel in the final segmentation, and / or the pixel is located above and adjacent to the first pixel.
[0127] Optionally, the non-adjacent pixel above can be a pixel that, in the same image, is ultimately divided into the same image block as the first pixel. The pixel is located above the first pixel but is not adjacent to it.
[0128] Optionally, the left adjacent pixel can be a pixel that shares the same image block as the first pixel in the final image segmentation, and / or the pixel is located to the left of the first pixel and adjacent to the first pixel.
[0129] Optionally, the non-adjacent pixel on the left can be a pixel that shares the same image block as the first pixel in the final image segmentation. The pixel is located to the left of the first pixel but is not adjacent to it.
[0130] Optionally, the upper left adjacent pixel can be a pixel in the same image block as the first pixel in the final division, and / or the pixel is located to the upper left of the first pixel and adjacent to the first pixel.
[0131] Optionally, the non-adjacent pixel in the upper left corner can be a pixel that shares the same image block as the first pixel in the final image segmentation. Although the pixel is located to the upper left of the first pixel, the pixel and the first pixel are not adjacent.
[0132] Optionally, the lower left adjacent pixel can be a pixel in the same image block as the first pixel in the final division, and / or the pixel is located to the lower left of the first pixel and adjacent to the first pixel.
[0133] Optionally, the non-adjacent pixel in the lower left corner can be a pixel that shares the same image block as the first pixel in the final image segmentation, and / or the pixel is located in the lower left corner of the first pixel and is not adjacent to the first pixel.
[0134] Optionally, the upper right adjacent pixel can be a pixel in the same image block as the first pixel in the final division, and / or the pixel is located to the upper right of the first pixel and adjacent to the first pixel.
[0135] Optionally, the non-adjacent pixel in the upper right corner can be a pixel that shares the same image block as the first pixel in the final image segmentation, and / or the pixel is located in the upper right corner of the first pixel and is not adjacent to the first pixel.
[0136] Optionally, at least one of the following can be a reconstructed pixel or a predicted pixel: the upper adjacent pixel, the upper non-adjacent pixel, the left adjacent pixel, the left non-adjacent pixel, the upper left adjacent pixel, the upper left non-adjacent pixel, the lower left adjacent pixel, the lower left non-adjacent pixel, the upper right adjacent pixel, and the upper right non-adjacent pixel.
[0137] Optionally, at least one of the following can be used as a pixel in the reference block: the upper adjacent pixel, the upper non-adjacent pixel, the left adjacent pixel, the left non-adjacent pixel, the upper left adjacent pixel, the upper left non-adjacent pixel, the lower left adjacent pixel, the lower left non-adjacent pixel, the upper right adjacent pixel, and the upper right non-adjacent pixel. Alternatively, at least one obtained pixel can be deduced or calculated to obtain the pixel in the reference block, and / or the reference block can be determined or obtained based on the pixel in the reference block.
[0138] In this embodiment, a reference block is determined or generated based on at least one of the following: the top adjacent pixel, the top non-adjacent pixel, the left adjacent pixel, the left non-adjacent pixel, the top left adjacent pixel, the top left non-adjacent pixel, the bottom left adjacent pixel, the bottom left non-adjacent pixel, the top right adjacent pixel, and the top right non-adjacent pixel. A prediction block is then determined or generated based on the derivative block corresponding to at least one reference block. Specifically, during prediction processing, the corresponding derivative block can be determined based on an accurate and effective reference block. Using this derivative block for prediction processing can improve the prediction effect of the prediction processing.
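As an illustration of Method 1, the following is a minimal sketch of collecting adjacent and non-adjacent reference pixels around a block to be predicted. The function name `gather_reference_pixels`, the `offset` parameter (1 for the adjacent row or column, larger values for a non-adjacent one), and the use of `None` for unavailable positions are assumptions made for this example and are not taken from this application.

```python
from typing import Dict, List, Optional

Frame = List[List[int]]  # row-major 2-D list of already reconstructed samples


def gather_reference_pixels(frame: Frame, x0: int, y0: int,
                            width: int, height: int,
                            offset: int = 1) -> Dict[str, List[Optional[int]]]:
    """Collect reference pixels around the block whose top-left sample is (x0, y0).

    offset=1 selects the adjacent row/column; offset>1 selects a non-adjacent one.
    Positions outside the frame are reported as None (unavailable).
    """
    rows, cols = len(frame), len(frame[0])

    def sample(x: int, y: int) -> Optional[int]:
        return frame[y][x] if 0 <= x < cols and 0 <= y < rows else None

    top_y, left_x = y0 - offset, x0 - offset
    return {
        "top": [sample(x, top_y) for x in range(x0, x0 + width)],
        "top_right": [sample(x, top_y) for x in range(x0 + width, x0 + 2 * width)],
        "left": [sample(left_x, y) for y in range(y0, y0 + height)],
        "bottom_left": [sample(left_x, y) for y in range(y0 + height, y0 + 2 * height)],
        "top_left": [sample(left_x, top_y)],
    }


# Toy 8x8 frame in which the sample at row r, column c has the value 10*r + c.
frame = [[10 * r + c for c in range(8)] for r in range(8)]
refs = gather_reference_pixels(frame, x0=2, y0=2, width=2, height=2)
```

The collected pixels could be used directly as the pixels of the reference block, or be further derived or calculated, as the paragraphs above describe.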
[0139] Method 2: At least one of the following corresponding to the block to be predicted: neighboring block, non-neighboring block, cross-component block, co-located block, temporal block, and default block;
[0140] Optionally, the default block can be a pre-set block, such as a block with typical pixel characteristics pre-set by the encoder and / or decoder.
[0141] Optionally, the neighboring block can be a block adjacent to the block to be predicted, and / or a block that has already been predicted or reconstructed.
[0142] Optionally, a non-neighbor block can be a block that is not adjacent to the block to be predicted, and / or a block that has already been predicted or reconstructed.
[0143] Optionally, the co-located block can be an image block in the co-located image that has the same position and size as the block to be predicted. Optionally, the co-located image can be the image among the reference images that is temporally closest to the current image.
[0144] Optionally, the temporal block can be a block that is distinguished in the time domain, such as an image block in the previous frame. For example, suppose the video data contains three frames, with the first frame played in the first second, the second frame in the second second, and the third frame in the third second. If the image block to be predicted at the current moment (such as the block to be predicted) is an image block obtained by dividing the second frame, then the temporal block can be determined to be the corresponding image block in the first frame.
[0145] Optionally, the cross-component block can be an image block in a different component than the current at least one image block to be predicted. For example, if the current at least one image block to be predicted is a Y component, then the cross-component block can be an image block containing the U and / or V components. Optionally, if the current at least one image block to be predicted is a U component, then the cross-component block can be an image block containing the Y and / or V components. Optionally, if the current at least one image block to be predicted is a V component, then the cross-component block can be an image block containing the U and / or Y components.
[0146] Optionally, at least one pixel can be obtained from at least one of the following: neighboring blocks, non-neighboring blocks, cross-component blocks, co-located blocks, temporal blocks, and default blocks corresponding to the block to be predicted. This at least one obtained pixel can be used as a pixel in the reference block, or the at least one obtained pixel can be derived or calculated to obtain the pixel in the reference block. The reference block is then determined or obtained based on the pixels in the reference block.
[0147] Optionally, at least one of the following can be determined from at least one of the neighboring blocks, non-neighboring blocks, cross-component blocks, co-position blocks, temporal blocks, and default blocks corresponding to the block to be predicted: the upper adjacent pixel, the upper non-adjacent pixel, the left adjacent pixel, the left non-adjacent pixel, the upper left adjacent pixel, and the upper left non-adjacent pixel, and used as the pixels of the reference block. Alternatively, the corresponding adjacent regions and / or non-adjacent regions can be determined from at least one of the neighboring blocks, non-neighboring blocks, cross-component blocks, co-position blocks, temporal blocks, and default blocks corresponding to the block to be predicted, and at least one pixel can be selected from them as the pixels of the reference block. The reference block is determined or obtained based on the pixels in the reference block.
[0148] In this embodiment, a reference block is determined or generated based on at least one of the neighboring blocks, non-neighboring blocks, cross-component blocks, co-position blocks, temporal blocks, and default blocks corresponding to the block to be predicted. A prediction block is determined or generated based on the derived block corresponding to at least one reference block. Specifically, when performing prediction processing, the corresponding derived block can be determined based on an accurate and effective reference block. Using the derived block for prediction processing can improve the prediction effect of the prediction processing.
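As an illustration of two of the block types in Method 2, the following is a minimal sketch of fetching a co-located block from a reference image and a cross-component (luminance) block for a chrominance block. The function names and the `scale_x` / `scale_y` subsampling factors are assumptions made for this example; this application does not state a chroma sampling format.

```python
from typing import List

Plane = List[List[int]]  # one colour component of one image, row-major


def crop_block(plane: Plane, x0: int, y0: int, width: int, height: int) -> Plane:
    """Return the width x height block whose top-left sample is (x0, y0).

    Python slicing silently truncates at the plane border, so callers are
    expected to pass coordinates that lie inside the plane.
    """
    return [row[x0:x0 + width] for row in plane[y0:y0 + height]]


def co_located_block(reference_plane: Plane, x0: int, y0: int,
                     width: int, height: int) -> Plane:
    """Block with the same position and size, taken from a reference image."""
    return crop_block(reference_plane, x0, y0, width, height)


def luma_block_for_chroma(luma_plane: Plane, x0: int, y0: int,
                          width: int, height: int,
                          scale_x: int = 2, scale_y: int = 2) -> Plane:
    """Cross-component block: the luma area covering a chroma block.

    scale_x / scale_y are assumed subsampling factors (2, 2 corresponds to a
    4:2:0-style layout); use 1, 1 when both planes have the same resolution.
    """
    return crop_block(luma_plane, x0 * scale_x, y0 * scale_y,
                      width * scale_x, height * scale_y)


# Toy 8x8 plane in which the sample at row r, column c has the value 10*r + c.
previous_luma = [[10 * r + c for c in range(8)] for r in range(8)]
temporal = co_located_block(previous_luma, x0=2, y0=4, width=2, height=2)
cross = luma_block_for_chroma(previous_luma, x0=1, y0=1, width=1, height=1)
```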
[0149] Method 3: At least one of the following: width, height, block size, and block area of the block to be predicted;
[0150] Optionally, at least one reference block is determined or obtained based on at least one of the width, height, block size, and block area of at least one block to be predicted.
[0151] Optionally, for example, if at least one of the width, height, block size, and block area of at least one block to be predicted is greater than a preset threshold, then at least one image block is selected as a reference block from the neighboring blocks, non-neighboring blocks, cross-component blocks, co-location blocks, temporal blocks, and default blocks corresponding to at least one block to be predicted.
[0152] Alternatively, at least one of the width, height, block size, and block area of at least one block to be predicted can be input into a neural network used to determine a reference block, and the output can be a reference block.
[0153] In this embodiment, a reference block is determined or generated based on at least one of the width, height, block size, and block area of the image block, and a prediction block is determined or generated based on the derivative block corresponding to at least one reference block. Specifically, during prediction processing, the corresponding derivative block can be determined based on an accurate and effective reference block, and the prediction effect can be improved by using the derivative block for prediction processing.
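The threshold example of Method 3 can be sketched as follows. This is only an illustration: the priority order `PRIORITY`, the function name `pick_reference_source`, and the default threshold of 64 are hypothetical, and the behaviour when the threshold is not exceeded is not specified in this application, so the sketch simply returns `None` in that case.

```python
from typing import Dict, List, Optional

Block = List[List[int]]

# Hypothetical priority order over the candidate sources named in the text.
PRIORITY = ["neighboring", "non_neighboring", "cross_component",
            "co_located", "temporal", "default"]


def pick_reference_source(width: int, height: int,
                          candidates: Dict[str, Optional[Block]],
                          area_threshold: int = 64) -> Optional[str]:
    """Return the name of the first available candidate source, or None.

    Selection is attempted only when the block area exceeds the threshold,
    mirroring the "greater than a preset threshold" example in the text.
    """
    if width * height <= area_threshold:
        return None  # below-threshold behaviour is not specified in the text
    for name in PRIORITY:
        if candidates.get(name) is not None:
            return name
    return None


# Only a co-located block is available in this toy setup.
available = {"neighboring": None, "co_located": [[1]]}
chosen = pick_reference_source(16, 16, available)
```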
[0154] Method 4: Candidate motion vectors or candidate block vectors of the block to be predicted are used to determine or generate candidate blocks;
[0155] Optionally, the candidate motion vector or candidate block vector of the block to be predicted may include the motion vector or block vector corresponding to at least one of the neighboring blocks, non-neighboring blocks, cross-component blocks, co-position blocks, temporal blocks, and default blocks corresponding to the block to be predicted. It may also include the motion vector or block vector corresponding to at least one of the above adjacent pixels, above non-adjacent pixels, left adjacent pixels, left non-adjacent pixels, upper left adjacent pixels, and upper left non-adjacent pixels of the block to be predicted. It may also include the motion vector or block vector of the block to be predicted, etc. The following only uses the motion vector or block vector of the block to be predicted as an example.
[0156] Optionally, block vector calculation is performed on the block to be predicted, and candidate blocks corresponding to the block to be predicted are determined based on the block vector calculation results. For example, the pixels corresponding to the block vector calculation results are used as pixels in the candidate blocks.
[0157] Optionally, motion vectors are calculated for the block to be predicted, and candidate blocks corresponding to the block to be predicted are determined based on the motion vector calculation results. For example, the pixels corresponding to the motion vector calculation results are used as pixels in the candidate blocks.
[0158] In this embodiment, a reference block is determined or generated by the candidate motion vector or candidate block vector of the block to be predicted. Based on the derivative block corresponding to at least one reference block, a prediction block is determined or generated. Specifically, during prediction processing, the corresponding derivative block can be determined based on an accurate and effective reference block. Using the derivative block for prediction processing can improve the prediction effect of the prediction processing.
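As an illustration of Method 4, the following is a minimal sketch of locating a candidate block by displacing the position of the block to be predicted with a candidate block vector or motion vector. It assumes integer-valued vectors and treats a displaced block that leaves the picture as unavailable; neither assumption is stated in this application, and the function name is hypothetical.

```python
from typing import List, Optional

Picture = List[List[int]]  # reconstructed samples, row-major


def candidate_block_from_vector(picture: Picture, x0: int, y0: int,
                                width: int, height: int,
                                vec_x: int, vec_y: int) -> Optional[Picture]:
    """Return the block displaced by (vec_x, vec_y) from (x0, y0).

    Returns None when any part of the displaced block lies outside the picture.
    """
    rows, cols = len(picture), len(picture[0])
    src_x, src_y = x0 + vec_x, y0 + vec_y
    if src_x < 0 or src_y < 0 or src_x + width > cols or src_y + height > rows:
        return None
    return [row[src_x:src_x + width] for row in picture[src_y:src_y + height]]


# Toy 8x8 picture in which the sample at row r, column c has the value 10*r + c.
picture = [[10 * r + c for c in range(8)] for r in range(8)]
candidate = candidate_block_from_vector(picture, 4, 4, 2, 2, vec_x=-3, vec_y=-2)
```

For a block vector the source picture would be the current picture; for a motion vector it would be a reference picture. The lookup itself is the same in both cases.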
[0159] Method 5: If the first information of the block to be predicted satisfies the first condition, then the reference block is the first reference block;
[0160] Optionally, the first information of the block to be predicted can be at least one of the items described in Method 1 to Method 4 above.
[0161] Optionally, the first condition can be any condition set by the user in advance, such as the width of the block to be predicted being greater than a preset width threshold, or the height of the block to be predicted being greater than a preset height threshold.
[0162] Optionally, the first reference block can be a pre-defined reference block, such as at least one of the following: neighboring block, non-neighboring block, cross-component block, co-position block, temporal block, candidate block, and default block corresponding to the block to be predicted.
[0163] Optionally, when the first information of the block to be predicted satisfies the first condition, the reference block can be determined to be the first reference block, and the prediction block can be determined or generated based on the derived block corresponding to the first reference block of at least one block to be predicted.
[0164] In this embodiment, if the first information of the block to be predicted meets the first condition, then the reference block is the first reference block. The prediction block is determined or generated based on the derivative block corresponding to the first reference block. Specifically, when performing prediction processing, the corresponding derivative block can be determined based on the accurate and effective reference block. The prediction effect of the prediction processing can be improved by using the derivative block for prediction processing.
[0165] Method 6: If the first information of the block to be predicted satisfies the second condition, then the reference block is the second reference block.
[0166] Optionally, the second condition can be any condition set by the user in advance, and / or can be different from the first condition. For example, if the first condition is that the width of the block to be predicted is greater than the preset width threshold, then the second condition can be that the width of the block to be predicted is less than the preset width threshold.
[0167] Optionally, the second reference block can be a pre-defined reference block, such as a block other than the first reference block among at least one of the following: neighboring blocks, non-neighboring blocks, cross-component blocks, co-position blocks, temporal blocks, candidate blocks, and default blocks corresponding to the block to be predicted.
[0168] In this embodiment, if the first information of the block to be predicted satisfies the second condition, then the reference block is the second reference block. The prediction block is determined or generated based on the derivative block corresponding to the second reference block. Thus, when performing prediction processing, the corresponding derivative block can be determined based on the accurate and effective reference block, and the prediction processing can be performed using the derivative block, thereby improving the prediction effect of the prediction processing.
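The condition-based selection of Methods 5 and 6 can be sketched with the width example given above. The function name and the handling of a width exactly equal to the threshold (neither condition holds, so `None` is returned) are assumptions made for this illustration.

```python
from typing import List, Optional

Block = List[List[int]]


def choose_reference_block(width: int, width_threshold: int,
                           first_reference: Block,
                           second_reference: Block) -> Optional[Block]:
    """Pick the reference block from the width of the block to be predicted.

    First condition:  width > threshold  -> first reference block.
    Second condition: width < threshold  -> second reference block.
    """
    if width > width_threshold:
        return first_reference
    if width < width_threshold:
        return second_reference
    return None  # equality is covered by neither example condition


first_ref = [[1, 1], [1, 1]]
second_ref = [[2, 2], [2, 2]]
selected = choose_reference_block(16, 8, first_ref, second_ref)
```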
[0169] Optionally, in step S1, the predicted block can be an image block that has undergone prediction processing.
[0170] Optionally, the prediction block may include at least one predicted pixel, or it may include pixels associated with at least one predicted pixel.
[0171] Optionally, at the encoding end, the prediction of a pixel can be taken as a predicted pixel, or simply called a predicted pixel. At the decoding end, the reconstruction of a pixel can likewise be taken as a predicted pixel, or simply called a predicted pixel.
[0172] Optionally, the processing device may be a decoding end, in which case the prediction block may be a decoded image block, and / or the prediction block may include the reconstruction of pixels.
[0173] Optionally, the processing device may be an encoding end, whereby the prediction block may be an image block that has undergone prediction processing, and / or the prediction block may include the prediction of pixels.
[0174] Optionally, for at least one block to be predicted, the derived block corresponding to the reference block of the at least one block to be predicted can be determined first, and the prediction module can be used in combination with the derived block to predict the block to be predicted, thereby determining or generating the prediction block. The prediction module can use a lookup table, a neural network, or a mathematical function, etc., without any restrictions.
[0175] In this embodiment, by taking into account the derived blocks of at least one reference block of the block to be predicted when determining or generating the prediction block, the prediction effect of the prediction process can be improved.
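The following is a purely illustrative sketch of a prediction module that combines a reference block with a derived block through a lookup table. This application does not specify how the lookup table is indexed or what it stores; the indexing by a right-shifted derived value, the table contents, the `shift` and `bit_depth` parameters, and the function name are all hypothetical.

```python
from typing import List

Block = List[List[int]]


def predict_with_lut(reference: Block, derived: Block, lut: List[int],
                     shift: int = 2, bit_depth: int = 8) -> Block:
    """Add a table-driven correction to each reference sample.

    Hypothetical scheme: each derived value is quantised with a right shift,
    centred on the middle of the table, clamped to a valid index, and the
    table entry is added to the co-located reference sample.
    """
    max_value = (1 << bit_depth) - 1
    centre = len(lut) // 2
    prediction = []
    for ref_row, der_row in zip(reference, derived):
        out_row = []
        for ref_sample, der_sample in zip(ref_row, der_row):
            index = max(0, min(len(lut) - 1, (der_sample >> shift) + centre))
            out_row.append(max(0, min(max_value, ref_sample + lut[index])))
        prediction.append(out_row)
    return prediction


reference = [[100, 102], [255, 0]]
derived = [[8, -8], [4, -4]]
lut = [-2, -1, 0, 1, 2]  # hypothetical correction offsets
prediction = predict_with_lut(reference, derived, lut)
```

The final clamp keeps every predicted sample inside the valid range for the assumed bit depth. A neural network or a mathematical function could take the place of the table lookup, as the text notes.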
[0176] Second Embodiment
[0177] Based on the first embodiment, a second embodiment is proposed.
[0178] In this embodiment, the image processing method further includes at least one of the following methods seven to ten:
[0179] Method 7: The derived block includes at least one of the following: luminance component derived block, chrominance component derived block, and cross-component derived block;
[0180] Optionally, at least one reference block includes a reference prediction block of the reference block, and at least one derivative block of the reference prediction block includes a luminance component derivative block of the reference prediction block in the luminance component, a chrominance component derivative block of the reference prediction block in the chrominance component, and a cross-component derivative block of the reference prediction block. The cross-component derivative block may include derivative blocks of multiple components, such as at least one of the derivative blocks of the reference prediction block in the Y component, U component, and V component.
[0181] Optionally, at least one reference block includes a reference reconstruction block of the reference block, and at least one derivative block of the reference reconstruction block includes a luminance component derivative block of the reference reconstruction block in the luminance component, a chrominance component derivative block of the reference reconstruction block in the chrominance component, and a cross-component derivative block of the reference reconstruction block. The cross-component derivative block may include derivative blocks of multiple components, such as at least one of the derivative blocks of the reference reconstruction block in the Y component, U component, and V component.
[0182] Optionally, the reference prediction block and / or reference reconstruction block can apply the chroma component derivative block of the chroma component to the luminance component, and the reference prediction block and / or reference reconstruction block can apply the luminance component derivative block of the luminance component to the chroma component.
[0183] In this embodiment, by determining at least one of the following derivative blocks corresponding to the reference prediction block and / or the reference reconstruction block of at least one reference block: a luminance component derivative block, a chrominance component derivative block, and a cross-component derivative block, derivative blocks can be obtained from different components. When determining or generating a prediction block, the derivative blocks of the reference block of at least one block to be predicted are comprehensively considered, which can improve the prediction effect of the prediction process.
[0184] Method 8: The luminance component derivative block includes at least one of a first horizontal derivative block, a first vertical derivative block, and a first horizontal-vertical mixed derivative block;
[0185] Optionally, corresponding derived blocks can be obtained from different directions on the luminance component, such as a first horizontal derived block in the horizontal direction, a first vertical derived block in the vertical direction, and a first horizontal and vertical mixed derived block in the horizontal and vertical mixed direction.
[0186] Optionally, the luminance component derivative block of the reference prediction block may include a first horizontal derivative block of the reference prediction block in the horizontal direction of the luminance component, a first vertical derivative block in the vertical direction of the luminance component, and a first horizontal and vertical mixed derivative block in the horizontal and vertical mixing direction of the luminance component.
[0187] Optionally, the luminance component derivative block of the reference reconstruction block may include a first horizontal derivative block of the reference reconstruction block in the horizontal direction of the luminance component, a first vertical derivative block in the vertical direction of the luminance component, and a first horizontal and vertical blending derivative block in the horizontal and vertical blending direction of the luminance component.
[0188] In this embodiment, the luminance component derivative block corresponding to the reference prediction block and / or the reference reconstruction block of at least one reference block is determined to include at least one of a first horizontal derivative block, a first vertical derivative block, and a first horizontal-vertical mixed derivative block. This enables derivative blocks in the luminance component to be obtained from the horizontal direction, the vertical direction, and the horizontal-vertical mixed direction. When determining or generating a prediction block, the derivative blocks of the reference block of at least one block to be predicted are comprehensively considered, which can improve the prediction effect of the prediction process.
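The following is a minimal sketch of producing horizontal, vertical, and horizontal-vertical mixed blocks from a reference block. It assumes, purely for illustration, that a derived (derivative) block is a finite-difference block with the same size as the reference block; this passage does not define the derivation itself, and the function names and the edge-replication rule are hypothetical.

```python
from typing import List

Block = List[List[int]]


def horizontal_derived(block: Block) -> Block:
    """Right-minus-left difference per sample, replicating the edge columns."""
    width = len(block[0])
    return [[row[min(x + 1, width - 1)] - row[max(x - 1, 0)]
             for x in range(width)] for row in block]


def vertical_derived(block: Block) -> Block:
    """Below-minus-above difference per sample, replicating the edge rows."""
    height, width = len(block), len(block[0])
    return [[block[min(y + 1, height - 1)][x] - block[max(y - 1, 0)][x]
             for x in range(width)] for y in range(height)]


def mixed_derived(block: Block) -> Block:
    """Horizontal difference followed by vertical difference."""
    return vertical_derived(horizontal_derived(block))


luma_reference = [[1, 2, 4],
                  [3, 5, 8],
                  [6, 9, 13]]
first_horizontal = horizontal_derived(luma_reference)
first_vertical = vertical_derived(luma_reference)
first_mixed = mixed_derived(luma_reference)
```

The same three functions could be applied to a chrominance plane to obtain the second horizontal, vertical, and mixed blocks, under the same assumption.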
[0189] Method 9: The chrominance component derived block includes at least one of a second horizontal derived block, a second vertical derived block, and a second horizontal-vertical mixed derived block;
[0190] Optionally, corresponding derived blocks can be obtained from different directions on the chrominance component, such as a second horizontal derived block in the horizontal direction, a second vertical derived block in the vertical direction, and a second horizontal-vertical mixed derived block in the horizontal-vertical mixed direction.
[0191] Optionally, the chrominance component derived block of the reference prediction block may include a second horizontal derived block of the reference prediction block in the horizontal direction of the chrominance component, a second vertical derived block in the vertical direction of the chrominance component, and a second horizontal-vertical mixed derived block in the horizontal-vertical mixed direction of the chrominance component.
[0192] Optionally, the chroma component derivative block of the reference reconstruction block may include a second horizontal derivative block of the reference reconstruction block in the horizontal direction of the chroma component, a second vertical derivative block in the vertical direction of the chroma component, and a second horizontal and vertical blending derivative block in the horizontal and vertical blending direction of the chroma component.
[0193] In this embodiment, the chrominance component derivative block corresponding to the reference prediction block and / or reference reconstruction block of at least one reference block is determined to include at least one of a second horizontal derivative block, a second vertical derivative block, and a second horizontal-vertical mixed derivative block. Derivative blocks in the chrominance component can thus be obtained from the horizontal direction, the vertical direction, and the horizontal-vertical mixed direction. When determining or generating a prediction block, the derivative blocks of at least one reference block of the block to be predicted are comprehensively considered, which can improve the prediction effect of the prediction process.
[0194] Method 10: The cross-component derivative block includes at least one of the third horizontal derivative block, the third vertical derivative block, and the third horizontal-vertical hybrid derivative block.
[0195] Optionally, corresponding derived blocks can be obtained from different directions across components, such as a third horizontal derived block in the horizontal direction, a third vertical derived block in the vertical direction, and a third horizontal-vertical hybrid derived block in the horizontal-vertical hybrid direction.
[0196] Optionally, the cross-component derived blocks of the reference prediction block may include a third horizontal derived block obtained from the horizontal direction, a third vertical derived block obtained from the vertical direction, and a third horizontal-vertical hybrid derived block obtained from the horizontal-vertical hybrid direction, respectively, in the cross-component.
[0197] Optionally, the cross-component derived blocks of the reference reconstruction block may include a third horizontal derived block obtained from the horizontal direction, a third vertical derived block obtained from the vertical direction, and a third horizontal-vertical hybrid derived block obtained from the horizontal-vertical hybrid direction, respectively, in the cross component.
[0198] In this embodiment, by determining at least one of the following cross-component derived blocks corresponding to the reference prediction block and / or reference reconstruction block of at least one reference block: a third horizontal derived block, a third vertical derived block, and a third horizontal-vertical hybrid derived block, it is possible to obtain derived blocks in the horizontal direction, the vertical direction, and the horizontal-vertical hybrid direction in the cross component. When determining or generating a prediction block, the derived blocks of the reference block of at least one block to be predicted are comprehensively considered, which can improve the prediction effect of the prediction process.
[0199] Third Embodiment
[0200] Based on any of the above embodiments, a third embodiment is proposed.
[0201] In this embodiment, referring to Figure 6, step S1 includes steps S11 and S12:
[0202] Step S11: Determine or obtain at least one derived block based on the reference reconstruction block and / or reference prediction block of at least one reference block;
[0203] Optionally, a derivative block corresponding to at least one reference reconstruction block can be determined or obtained based on the at least one reference reconstruction block, and / or corresponding derivative blocks can be obtained from different components and different directions, such as at least one of the luminance component derivative block, chrominance component derivative block, and cross-component derivative block of the at least one reference reconstruction block. Examples include: a first horizontal derivative block, a first vertical derivative block, and a first horizontal-vertical mixed derivative block obtained from the at least one reference reconstruction block in the horizontal direction, the vertical direction, and the horizontal-vertical mixed direction on the luminance component, respectively; a second horizontal derivative block, a second vertical derivative block, and a second horizontal-vertical mixed derivative block obtained from the at least one reference reconstruction block in the horizontal direction, the vertical direction, and the horizontal-vertical mixed direction on the chrominance component, respectively; and a third horizontal derivative block, a third vertical derivative block, and a third horizontal-vertical mixed derivative block obtained from the at least one reference reconstruction block in the horizontal direction, the vertical direction, and the horizontal-vertical mixed direction on the cross component, respectively.
[0204] Optionally, a derivative block corresponding to at least one reference prediction block can be determined or obtained based on at least one reference prediction block, and / or a corresponding derivative block can be obtained from different components and different directions, such as at least one of the luminance component derivative block, chrominance component derivative block, and cross-component derivative block of at least one reference prediction block, such as a first horizontal derivative block, a first vertical derivative block, and a first horizontal-vertical mixed derivative block obtained by at least one reference prediction block from the horizontal direction, the vertical direction, and the horizontal-vertical mixed direction on the luminance component, respectively; a second horizontal derivative block, a second vertical derivative block, and a second horizontal-vertical mixed derivative block obtained by at least one reference prediction block from the horizontal direction, the vertical direction, and the horizontal-vertical mixed direction on the chrominance component, respectively; and a third horizontal derivative block, a third vertical derivative block, and a third horizontal-vertical mixed derivative block obtained by at least one reference prediction block from the horizontal direction, the vertical direction, and the horizontal-vertical mixed direction on the cross component, respectively.
[0205] Optionally, the size parameters of the derived block are matched with the size parameters of the reference reconstructed block and / or the reference predicted block, and / or, step S11 includes steps a1 and a2:
[0206] Step a1: Crop a portion of at least one reference reconstruction block and / or reference prediction block;
[0207] Optionally, at least one reference reconstruction block may be identified as having a portion of its area to be clipped, and clipping may be performed. Alternatively, at least one reference prediction block may be identified as having a portion of its area to be clipped, and clipping may be performed.
[0208] Optionally, the size parameters of the partial region of at least one reference reconstruction block and / or reference prediction block can be determined, and the partial region can then be determined based on those size parameters. For example, if a reference prediction block a has three rows of pixels, and the size parameters of its partial region are the size parameters of the top row of pixels, then the top row of pixels of the reference prediction block a can be cropped.
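The three-row example above can be sketched as follows. This is a minimal illustration only: the function name `crop_top_rows`, the list-of-lists block representation, and the pixel values are assumptions made for the sketch and are not defined in this application.

```python
def crop_top_rows(block, rows):
    """Split a block into (cropped partial region, remaining block).

    `block` is a list of pixel rows; `rows` is the number of top rows
    that make up the partial region to be cropped.
    """
    if not 0 < rows < len(block):
        raise ValueError("rows must be at least 1 and smaller than the block height")
    return block[:rows], block[rows:]


# Hypothetical reference prediction block "a" with three rows of pixels.
block_a = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
]

# The partial region is the top row, so one row is cropped.
top_region, remaining = crop_top_rows(block_a, 1)
```

Here `top_region` holds the cropped top row, and `remaining` holds the two rows that are kept for the subsequent filling step.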
[0209] Optionally, before performing prediction processing on at least one reference prediction block, the reference reconstruction block and / or reference prediction block of at least one reference block, and their corresponding derived blocks, may be transformed, such as at least one of the following: DCT (Discrete Cosine Transform), DST (Discrete Sine Transform), KL (Karhunen–Loève Transform), Wavelet Transform, and Hadamard Transform.
[0210] Optionally, a preset transformation size parameter corresponding to at least one reference prediction block and / or reference reconstruction block can be determined, and the size parameter of a portion of the at least one reference prediction block and / or reference reconstruction block can be determined or obtained based on the preset transformation size parameter.
[0211] Optionally, the size parameters of the partial region are smaller than the preset transformation size parameters.
[0212] Optionally, the size parameters of a partial region may include the size parameters of the left partial region, the size parameters of the right partial region, the sum of the size parameters of the left and right partial regions, the size parameters of the upper partial region, the size parameters of the lower partial region, the sum of the size parameters of the upper and lower partial regions, the sum of the size parameters of the left, right, upper and lower partial regions, etc.
[0213] Optionally, the size parameters may include at least one of width, height, area, and perimeter.
[0214] Optionally, the preset transformation size parameters can be dynamically changing or statically fixed, and there is no restriction here.
[0215] Optionally, the preset transformation size parameter can be a transformation size parameter, which includes at least one of a transformation width and a transformation height.
[0216] Optionally, the transformation width is greater than or equal to the number of columns that are cropped from and / or filled into the partial region of at least one reference prediction block and / or reference reconstruction block;
[0217] Optionally, the transformation height is greater than or equal to the number of rows that are cropped from and / or filled into the partial region of at least one reference prediction block and / or reference reconstruction block.
[0218] Optionally, the preset transformation size parameters can be the same as or different from the size parameters of the reference prediction block and / or the reference reconstruction block. Optionally, the preset transformation size parameters can also be set in advance, without any restrictions.
[0219] Optionally, the partial region of at least one reference reconstruction block and / or reference prediction block is determined or obtained according to at least one of Methods 11 to 14:
[0220] Method 11: At least one of the leftmost at least one column, the rightmost at least one column, the top at least one row, and the bottom at least one row of at least one reference reconstruction block and / or reference prediction block;
[0221] Optionally, a portion of the reference reconstruction block can be determined or obtained based on at least one column on the far left of the reference reconstruction block.
[0222] Optionally, a portion of the at least one reference reconstruction block can be determined or obtained based on at least one column on the far right of the at least one reference reconstruction block.
[0223] Optionally, a portion of the at least one reference reconstruction block can be determined or obtained based on at least one row at the top of the at least one reference reconstruction block.
[0224] Optionally, a portion of the reference reconstruction block can be determined or obtained based on at least one row at the bottom of the reference reconstruction block.
[0225] Optionally, a portion of the reference prediction block can be determined or obtained based on at least one column from the leftmost side of the reference prediction block.
[0226] Optionally, a portion of the reference prediction block can be determined or obtained based on at least one column on the far right of the reference prediction block.
[0227] Optionally, a portion of a reference prediction block can be determined or obtained based on at least one row at the top of at least one reference prediction block.
[0228] Optionally, a portion of the reference prediction block can be determined or obtained based on at least one row at the bottom of the reference prediction block.
[0229] For example, when the reference reconstruction block and / or reference prediction block is a 2x2 image block, the top row of the reference reconstruction block and / or reference prediction block can be used as a partial region.
[0230] In this embodiment, by determining at least one of the leftmost column, the rightmost column, the topmost row, and the bottommost row of at least one reference reconstruction block and / or reference prediction block as the part to be clipped, and then performing clipping, it can be ensured that the clipped part is located at the edge position of at least one reference reconstruction block and / or reference prediction block, thereby ensuring the accuracy of the derived blocks determined or obtained based on the clipped part and thus improving the prediction effect.
[0231] Method 12: Transformation size parameters;
[0232] Optionally, the transformation size parameters include at least one of the transformation width and transformation height.
[0233] Optionally, the transformation width is greater than or equal to the number of columns that are cropped and / or filled in a portion of at least one reference reconstruction block and / or reference prediction block;
[0234] Optionally, the transformation height is greater than or equal to the number of rows in which a portion of the region of at least one reference reconstruction block and / or reference prediction block is clipped and / or filled.
[0235] Optionally, the size parameter of a portion of at least one reference reconstruction block and / or reference prediction block is smaller than the transformation size parameter. Therefore, the size parameter of a portion of at least one reference reconstruction block can be determined or obtained based on the transformation size parameter of at least one reference reconstruction block. Then, the portion of at least one reference reconstruction block can be determined or obtained based on the size parameter of a portion of at least one reference reconstruction block, such as at least one of the leftmost column, the rightmost column, the topmost row, and the bottommost row of at least one reference reconstruction block.
[0236] Optionally, the size parameters of a portion of the at least one reference prediction block can be determined or obtained based on the transformation size parameters of the at least one reference prediction block, and then the portion of the at least one reference prediction block can be determined or obtained based on the size parameters of the portion of the at least one reference prediction block, such as at least one of the leftmost column, the rightmost column, the topmost row, and the bottommost row of the at least one reference prediction block.
[0237] In this embodiment, by determining or obtaining a partial region of at least one reference reconstruction block and / or reference prediction block based on the transformation size parameters, the accuracy of the determined partial region is ensured, thereby ensuring the accuracy of the derived block subsequently determined or obtained based on the cropped partial region, and thus improving the prediction effect.
[0238] Method 13: Cropping size parameters;
[0239] Optionally, the cropping size parameter can be a parameter used to crop a portion of at least one image patch.
[0240] Optionally, a portion of at least one reference reconstruction block and / or reference prediction block can be clipped based on clipping size parameters.
[0241] Optionally, the cropping size parameter can be a dynamically changing parameter or a statically fixed parameter.
[0242] Optionally, the cropping size parameters can be determined or obtained from the preset transformation size parameters, or they can be preset in advance.
[0243] Optionally, the cropping size parameter is less than or equal to the transformation size parameter.
[0244] Optionally, the clipping size parameters may include the clipping width and / or height, and may include the number of rows and / or columns to be clipped from at least one of the reference prediction block and the reference reconstruction block of the reference block.
[0245] Optionally, the partial region of at least one reference reconstruction block and / or reference prediction block can be determined or obtained based on the cropping size parameters.
[0246] Optionally, the number of rows and / or columns in the clipping size parameters that require clipping of at least one reference reconstruction block and / or reference prediction block can be used as the number of rows and / or columns contained in the partial region.
[0247] Optionally, the cropping size parameters can be the same as the size parameters of the partial region.
[0248] In this embodiment, the partial region is determined based on the cropping size parameters, so that the determination of the partial region is closely tied to the cropping operation. This ensures the accuracy of the determined partial region, thereby ensuring the accuracy of the derived blocks subsequently determined or obtained based on the partial region. This allows the predictor to effectively capture the transition features between image blocks based on the derived blocks, thereby improving the prediction effect of the prediction process.
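As an illustration of Method 13, the sketch below derives the partial regions directly from cropping size parameters expressed as a number of rows and / or columns per side, and checks that they do not exceed the transformation size parameters. The `CropSize` class, the `partial_regions` function, and the sample block are hypothetical names and values introduced only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropSize:
    """Hypothetical cropping size parameters: rows/columns to crop per side."""
    left: int = 0
    right: int = 0
    top: int = 0
    bottom: int = 0


def partial_regions(block, crop, transform_width, transform_height):
    """Return the edge regions selected by `crop`, keyed by side."""
    height, width = len(block), len(block[0])
    # The cropping size parameter must not exceed the transformation size parameter.
    if crop.left + crop.right > transform_width:
        raise ValueError("cropped columns exceed the transformation width")
    if crop.top + crop.bottom > transform_height:
        raise ValueError("cropped rows exceed the transformation height")
    if crop.left + crop.right >= width or crop.top + crop.bottom >= height:
        raise ValueError("cropping would remove the whole block")

    regions = {}
    if crop.left:
        regions["left"] = [row[:crop.left] for row in block]
    if crop.right:
        regions["right"] = [row[width - crop.right:] for row in block]
    if crop.top:
        regions["top"] = [row[:] for row in block[:crop.top]]
    if crop.bottom:
        regions["bottom"] = [row[:] for row in block[height - crop.bottom:]]
    return regions


# Hypothetical 4x4 block whose pixel at (r, c) has the value r * 10 + c.
block = [[r * 10 + c for c in range(4)] for r in range(4)]
regions = partial_regions(
    block, CropSize(left=1, right=1), transform_width=2, transform_height=2
)
```

With one column cropped on each side, `regions` contains the leftmost and rightmost columns of the block, so the number of columns in the cropping size parameters equals the number of columns in the partial region.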
[0249] Method 14: The sliding region size and / or the preset sliding step size of a sliding window on at least one reference reconstruction block and / or reference prediction block.
[0250] Optionally, the sliding window can be a window in the processing device that can slide on at least one reference reconstruction block and / or reference prediction block, and can slide on the displayed reference reconstruction block and / or reference prediction block in response to the user's drag trajectory.
[0251] Optionally, the preset sliding step size can be the step size that the sliding window needs to slide, or it can be set in advance.
[0252] Optionally, the sliding region can be the effective area of the sliding window. The effective region can be the area generated by the sliding window sliding over at least one image block, and / or a reference prediction block, and / or a reference reconstruction block.
[0253] Optionally, the sliding area size can be the dimensions of the sliding area, such as width, height, area, perimeter, etc.
[0254] Optionally, the sliding area of the sliding window that slides on at least one reference reconstruction block according to a preset sliding step can be used as a part of the reference reconstruction block, and the sliding area of the sliding window that slides on at least one reference prediction block according to a preset sliding step can be used as a part of the reference prediction block.
[0255] Optionally, the sliding area corresponding to the sliding area size of the sliding window sliding on at least one reference reconstruction block can be used as a part of the reference reconstruction block.
[0256] Optionally, the sliding area corresponding to the sliding area size of the sliding window sliding on at least one reference prediction block can be used as a part of the reference prediction block.
[0257] Optionally, the sliding region of the sliding window that slides on at least one reference reconstruction block and / or reference prediction block according to a preset sliding step can be processed to obtain the partial region of the at least one reference reconstruction block and / or reference prediction block. For example, the part of the sliding region that has a high degree of overlap with a sub-image block in the at least one reference reconstruction block can be selected as the partial region. Alternatively, the middle part of the sliding region can be removed to obtain the partial region.
[0258] Optionally, the partial region is determined or obtained based on the sliding region of the sliding window sliding on at least one reference prediction block according to a preset step size, and the partial region is cropped.
[0259] Optionally, the partial region is determined or obtained based on the sliding region of the sliding window sliding on at least one reference reconstruction block according to a preset step size, and the partial region is cropped.
[0260] In this embodiment, the partial region is determined or obtained based on the sliding region size and / or the preset sliding step size of a sliding window on at least one reference reconstruction block and / or reference prediction block. This ties the determination of the partial region closely to the user's needs, thereby guaranteeing the validity of the determined partial region. This, in turn, ensures the accuracy of the derived block subsequently determined or obtained based on the partial region. This allows the predictor to effectively capture the transition features between image blocks based on the derived block, thereby improving the prediction effect of the prediction process.
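One possible reading of Method 14 is sketched below. This application does not specify the scan order of the sliding window, so the raster-order traversal, the function name `sliding_regions`, and the sample block are assumptions made for illustration only.

```python
def sliding_regions(block, win_height, win_width, step):
    """Yield (top, left, region) for each position of a sliding window.

    The window moves in raster order with the same preset step in both
    directions (an assumption of this sketch).
    """
    height, width = len(block), len(block[0])
    if step < 1:
        raise ValueError("step must be at least 1")
    if win_height > height or win_width > width:
        raise ValueError("window does not fit inside the block")
    for top in range(0, height - win_height + 1, step):
        for left in range(0, width - win_width + 1, step):
            region = [row[left:left + win_width]
                      for row in block[top:top + win_height]]
            yield top, left, region


# Hypothetical 4x4 block whose pixel at (r, c) has the value r * 10 + c.
block = [[r * 10 + c for c in range(4)] for r in range(4)]

# A window one column wide and four rows high, moved with a step of 3,
# covers the leftmost and rightmost columns of the block.
windows = list(sliding_regions(block, win_height=4, win_width=1, step=3))
```

In this configuration the sliding region size and the sliding step together select the two edge columns as partial regions, which matches the edge regions described in Method 11.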
[0261] Step a2: Determine or obtain at least one derived block based on the filling result of filling the cropped at least one reference reconstruction block and / or reference prediction block.
[0262] Optionally, for a reference reconstruction block, if a portion of the reference reconstruction block is determined or obtained according to at least one of the above methods eleven to fourteen, the portion of the reference reconstruction block can be cropped, and then the cropped reference reconstruction block can be pixel-filled. Based on the pixel-filled reference reconstruction block, the corresponding derivative block can be determined or obtained, for example, the pixel-filled reference reconstruction block can be directly used as the corresponding derivative block.
[0263] Optionally, for a reference prediction block, if a portion of the reference prediction block is determined or obtained according to at least one of the above methods eleven to fourteen, the portion of the reference prediction block can be cropped, and then the cropped reference prediction block can be pixel-filled. Based on the pixel-filled reference prediction block, the corresponding derivative block can be determined or obtained, for example, the pixel-filled reference prediction block can be directly used as the corresponding derivative block.
[0264] Optionally, the filling method for the clipped reference reconstruction block and / or reference prediction block includes at least one of zero filling and mirror symmetric filling.
[0265] Optionally, zero filling can be achieved by adding zero values at the edges of at least one reference reconstruction block and / or reference prediction block to expand its size.
[0266] Optionally, mirror symmetric filling can be achieved by using the mirror image of the edge pixels of at least one reference reconstruction block and / or reference prediction block as the filling pixels.
[0267] Optionally, the at least one reference reconstruction block after being clipped can be filled according to the size parameters of at least one reference reconstruction block, so that the size parameters of the filled reference reconstruction block are the same as the size parameters of the at least one reference reconstruction block.
[0268] Optionally, the at least one reference prediction block after being clipped can be filled according to the size parameters of at least one reference prediction block, so that the size parameters of the filled reference prediction block are the same as the size parameters of the at least one reference prediction block.
[0269] Optionally, the size parameters of the derived block corresponding to the reference reconstruction block are the same as the size parameters of the reference reconstruction block.
[0270] Optionally, the size parameters of the derived block corresponding to the reference prediction block are the same as the size parameters of the reference prediction block.
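The two filling methods can be compared with a short sketch. The function `pad_right`, its `mode` argument, and the sample values are hypothetical; in particular, the mirror variant below repeats the edge pixel itself, which is only one common convention for mirror symmetric filling and is not stated in this application.

```python
def pad_right(block, columns, mode="zero"):
    """Append `columns` columns on the right side of every row of `block`."""
    if columns < 0:
        raise ValueError("columns must be non-negative")
    padded = []
    for row in block:
        if mode == "zero":
            fill = [0] * columns
        elif mode == "mirror":
            if columns > len(row):
                raise ValueError("not enough pixels to mirror")
            # Reflect the last `columns` pixels about the right edge.
            fill = row[::-1][:columns]
        else:
            raise ValueError("mode must be 'zero' or 'mirror'")
        padded.append(row + fill)
    return padded


# Hypothetical cropped block: two columns remain from an original width of 4.
cropped = [
    [2, 63],
    [5, 7],
]

zero_filled = pad_right(cropped, 2, mode="zero")      # [[2, 63, 0, 0], [5, 7, 0, 0]]
mirror_filled = pad_right(cropped, 2, mode="mirror")  # [[2, 63, 63, 2], [5, 7, 7, 5]]
```

In both cases the filled block has the same width as the original block, so the size parameters of the resulting derived block match those of the reference reconstruction block or reference prediction block.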
[0271] Optionally, when a portion of the reference reconstruction block and / or reference prediction block of at least one reference block is the leftmost column and / or the rightmost column of the at least one reference reconstruction block and / or reference prediction block, the leftmost column and / or the rightmost column of the at least one reference reconstruction block and / or reference prediction block can be clipped to obtain the clipped at least one reference reconstruction block and / or reference prediction block.
[0272] Optionally, the leftmost n1 columns and the rightmost m1 columns of at least one reference reconstruction block and / or reference prediction block are cropped, where n1+m1 is the transformation width.
[0273] Optionally, the n1 columns are at least one column, and the m1 columns are at least one column.
[0274] Optionally, the leftmost at least one column and the rightmost at least one column of at least one reference reconstruction block and / or reference prediction block can be cropped simultaneously or in a certain order, without any restriction.
[0275] Optionally, at least one reference reconstruction block and / or reference prediction block may be truncated in the horizontal and / or vertical directions.
[0276] Optionally, at least one column on the leftmost side and / or at least one column on the rightmost side of at least one reference prediction block can be cropped, and the cropped reference prediction block can be filled. A derived block can be determined or obtained based on the filled reference prediction block, and prediction processing can be performed based on the at least one derived block.
[0277] Optionally, at least one column on the leftmost side and / or at least one column on the rightmost side of at least one reference reconstruction block can be clipped, and the clipped reference reconstruction block can be filled. A derived block can be determined or obtained based on the filled reference reconstruction block, and then prediction processing can be performed based on the at least one derived block.
[0278] Optionally, when a portion of the reference reconstruction block and / or reference prediction block of at least one reference block is at least one row at the top and / or at least one row at the bottom of the at least one reference reconstruction block and / or reference prediction block, the at least one row at the top and / or at least one row at the bottom of the at least one reference reconstruction block and / or reference prediction block can be cropped to obtain the cropped at least one reference reconstruction block and / or reference prediction block.
[0279] Optionally, the top n2 rows and the bottom m2 rows of at least one reference reconstruction block and / or reference prediction block are cropped, where n2+m2 is the transformation height.
[0280] Optionally, the n2 rows are at least one row, and the m2 rows are at least one row.
[0281] Optionally, the top at least one row and the bottom at least one row of at least one reference reconstruction block and / or reference prediction block can be cropped simultaneously or in a certain order, without any restriction.
[0282] Optionally, at least one reference reconstruction block and / or reference prediction block may be truncated in the horizontal and / or vertical directions.
[0283] Optionally, at least one row at the top and / or at least one row at the bottom of at least one reference prediction block can be cropped, and the cropped reference prediction block can be filled. A derived block can be determined or obtained based on the filled reference prediction block, and prediction processing can be performed based on the at least one derived block.
[0284] Optionally, at least one row at the top and / or at least one row at the bottom of at least one reference reconstruction block can be clipped, and the clipped reference reconstruction block can be filled. A derived block can be determined or obtained based on the filled reference reconstruction block, and then prediction processing can be performed based on the at least one derived block.
[0285] In this embodiment, by cropping and filling a portion of at least one reference reconstruction block and / or reference prediction block, a derived block is determined or obtained. This ensures that the derived block is closely related to the original reference reconstruction block and / or reference prediction block, thus guaranteeing the accuracy of the derived block. Consequently, the predictor effectively captures the transition features between image blocks based on the derived block, thereby improving the prediction effect of the prediction process.
[0286] Optionally, the filling method for filling at least one reference reconstruction block and / or reference prediction block after clipping includes at least one of steps b1 to b2:
[0287] Step b1: Fill an even number of columns of pixels on the leftmost or rightmost side of the cropped at least one reference reconstruction block and / or reference prediction block;
[0288] Optionally, the number of fill columns for pixel fill is equal to the transform width.
[0289] Optionally, if at least one leftmost column and / or at least one rightmost column of at least one reference reconstruction block and / or reference prediction block are cropped, and / or the total number of cropped columns is even, then an even number of columns of pixels can be filled on the leftmost or rightmost side of the cropped at least one reference reconstruction block and / or reference prediction block, for example with zero-valued pixels, based on the size parameters of the at least one reference reconstruction block and / or reference prediction block, so that the size parameters of the filled reference prediction block are the same as those of the at least one reference prediction block, and the size parameters of the filled reference reconstruction block are the same as those of the at least one reference reconstruction block.
[0290] For example, if the leftmost column and the rightmost column of at least one reference reconstruction block and / or reference prediction block are cropped, two columns of pixels can be filled on the rightmost side of the cropped reference reconstruction block and / or reference prediction block, and / or the filling method can be zero filling or mirror symmetric filling.
[0291] Optionally, based on the size parameters of at least one reference prediction block, an even number of columns of pixels is filled on the leftmost or rightmost side of the cropped reference prediction block to obtain a filled reference prediction block. A derivative block is determined or obtained based on the filled reference prediction block, and prediction processing is performed based on at least one derivative block.
[0292] Optionally, based on the size parameters of at least one reference reconstruction block, an even number of columns of pixels is filled on the leftmost or rightmost side of the cropped reference reconstruction block to obtain a filled reference reconstruction block. A derivative block is determined or obtained based on the filled reference reconstruction block, and prediction processing is performed based on at least one derivative block.
[0293] For example, as shown in Figure 7, the boundary region of the reconstructed block of at least one image block can be cropped, and two columns of zero padding can be added on the rightmost side to obtain the first horizontal derived block. In the image block of Figure 7, the region with a pixel value of 0 is the zero-padded region, and the region with non-zero pixel values (such as 2, 63, etc.) is the region of the reconstructed block that has been neither cropped nor padded.
[0294] In this embodiment, the leftmost or rightmost side of the cropped reference reconstruction block and / or reference prediction block is filled with an even number of columns of pixels according to the size parameters of the at least one reference reconstruction block and / or reference prediction block, so as to generate a corresponding derived block. The filled pixels can then be treated as a whole when the derived block is subsequently transformed, thereby improving the prediction effect.
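The column cropping and filling described for this step can be illustrated with a minimal sketch. It assumes a block is stored as a list of rows, that one column is cropped from each side by default, and that the fill is always appended on the right. The function name `crop_and_pad_columns` and the exact mirror variant (edge sample repeated) are illustrative assumptions, not details taken from the application.

```python
from typing import List

Block = List[List[int]]


def crop_and_pad_columns(block: Block, crop_left: int = 1, crop_right: int = 1,
                         mode: str = "zero") -> Block:
    """Crop columns on both sides, then fill on the right to restore the width."""
    if mode not in ("zero", "mirror"):
        raise ValueError("mode must be 'zero' or 'mirror'")
    pad = crop_left + crop_right
    if pad % 2 != 0:
        raise ValueError("total number of cropped columns must be even")
    width = len(block[0])
    kept_width = width - pad
    if kept_width < 1 or (mode == "mirror" and kept_width < pad):
        raise ValueError("block is too narrow for this crop/fill combination")
    derived = []
    for row in block:
        kept = row[crop_left:width - crop_right]
        # Assumed mirror variant: reflect the kept samples, edge sample included.
        fill = [0] * pad if mode == "zero" else kept[::-1][:pad]
        derived.append(kept + fill)
    return derived


# Hypothetical 2x4 reconstructed block.
recon = [[2, 63, 5, 7],
         [4, 8, 9, 1]]
print(crop_and_pad_columns(recon))            # zero filling
print(crop_and_pad_columns(recon, mode="mirror"))
```

Because the fill width equals the number of cropped columns, the derived block keeps the size parameters of the original block, shifted by one column.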
[0295] Step b2 involves filling the top or bottom of at least one cropped reference reconstruction block and / or reference prediction block with an even number of rows of pixels.
[0296] Optionally, the number of fill rows for pixel fill is equal to the transform height.
[0297] Optionally, if at least one row at the top and / or at least one row at the bottom of at least one reference reconstruction block and / or reference prediction block is cropped, and / or the total number of cropped rows is even, then the top or bottom of the cropped reference reconstruction block and / or reference prediction block can be padded with an even number of rows of pixels (for example, zero-valued pixels) based on the size parameters of the at least one reference reconstruction block and / or reference prediction block, so that the size parameters of the padded reference prediction block are the same as those of the at least one reference prediction block, and the size parameters of the padded reference reconstruction block are the same as those of the at least one reference reconstruction block.
[0298] For example, if at least the top row and at least the bottom row of the reference reconstruction block and / or the reference prediction block are cropped, then the bottom of the cropped reference reconstruction block and / or reference prediction block can be filled with two rows of pixels, and / or the filling method can be zero-filling or mirror-symmetric filling.
[0299] Optionally, based on the size parameters of at least one reference prediction block, the top or bottom of the cropped reference prediction block is filled with an even number of rows of pixels to obtain a filled reference prediction block. A derived block is determined or obtained based on the filled reference prediction block, and prediction processing is performed based on at least one derived block.
[0300] Optionally, based on the size parameters of at least one reference reconstruction block, the top or bottom of the cropped reference reconstruction block is filled with an even number of rows of pixels to obtain a filled reference reconstruction block. A derived block is determined or obtained based on the filled reference reconstruction block, and then prediction processing is performed based on at least one derived block.
[0301] For example, as shown in Figure 8, the boundary region of the reconstructed block of at least one image block can be cropped, and two rows of zero padding can be added at the bottom to obtain the first vertical derived block. In the image block of Figure 8, the region with a pixel value of 0 is the zero-padded region, and the region with non-zero pixel values (such as 2, 63, etc.) is the region of the reconstructed block that has been neither cropped nor padded.
[0302] For example, as shown in Figure 9, the boundary region of the reconstructed block of at least one image block can be cropped, and two columns of mirror filling can be added on the far right and two rows of mirror filling at the bottom to obtain the first horizontal-vertical hybrid derived block. In Figure 9, the white area represents the mirror-filled area, while the gray area represents the area that has been neither cropped nor filled.
[0303] In this embodiment, by filling the top or bottom of the cropped reference reconstruction block and / or reference prediction block with an even number of rows of pixels according to the size parameters of at least one reference reconstruction block and / or reference prediction block, a corresponding derivative block can be generated. This allows the filled pixels to be considered as a whole when performing transformation processing on the derivative block, thereby improving the prediction effect.
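The row filling of step b2 and the hybrid case of Figure 9 can be sketched in the same way. This is a minimal illustration only. It assumes the same number of rows or columns is cropped from each side and that the fill is appended at the right and at the bottom. The helper names (`pad_right`, `pad_bottom`, `derive_block`) are hypothetical.

```python
from typing import List

Block = List[List[int]]


def pad_right(rows: Block, pad: int, mode: str) -> Block:
    """Append `pad` columns on the right using zero or mirror filling."""
    if mode == "zero":
        return [r + [0] * pad for r in rows]
    return [r + r[::-1][:pad] for r in rows]


def pad_bottom(rows: Block, pad: int, mode: str) -> Block:
    """Append `pad` rows at the bottom using zero or mirror filling."""
    if mode == "zero":
        return rows + [[0] * len(rows[0]) for _ in range(pad)]
    return rows + [list(r) for r in rows[::-1][:pad]]


def derive_block(block: Block, crop_rows: int = 0, crop_cols: int = 0,
                 mode: str = "zero") -> Block:
    """Crop `crop_rows`/`crop_cols` from each side, then fill to the original size."""
    if mode not in ("zero", "mirror"):
        raise ValueError("mode must be 'zero' or 'mirror'")
    height, width = len(block), len(block[0])
    pad_r, pad_c = 2 * crop_rows, 2 * crop_cols   # always an even fill count
    kept_h, kept_w = height - pad_r, width - pad_c
    if kept_h < 1 or kept_w < 1:
        raise ValueError("nothing left after cropping")
    if mode == "mirror" and (kept_h < pad_r or kept_w < pad_c):
        raise ValueError("not enough samples left to mirror")
    kept = [row[crop_cols:width - crop_cols]
            for row in block[crop_rows:height - crop_rows]]
    return pad_bottom(pad_right(kept, pad_c, mode), pad_r, mode)


block = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
vertical = derive_block(block, crop_rows=1)                       # Figure 8 style
hybrid = derive_block(block, crop_rows=1, crop_cols=1, mode="mirror")  # Figure 9 style
print(vertical)
print(hybrid)
```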
[0304] Step S12: Determine or generate a prediction block based on the neural network and / or lookup table, and at least one derived block.
[0305] Optionally, prediction processing is performed based on a neural network and / or a lookup table, as well as a reference prediction block and / or a derivative block corresponding to a reference reconstruction block of at least one reference block.
[0306] Optionally, at least one index can be determined or obtained by using at least one of the following: a derived block of at least one reference prediction block, at least one reference reconstruction block, and a derived block of at least one reference reconstruction block. A lookup is then performed in the lookup table based on the at least one index, and the prediction block after prediction processing is determined or obtained based on the lookup result.
[0307] Optionally, at least one of the following can be input into a neural network for model training: at least one reference prediction block and at least one reference reconstruction block. The prediction block after prediction processing can be determined or obtained based on the output result.
[0308] Optionally, transformation processing can be performed on at least one of the following: at least one reference prediction block and at least one reference reconstruction block, to obtain the transformation features corresponding to each block. The transformation features corresponding to each block are then input into the neural network, and the prediction block after prediction processing is determined or obtained based on the output results. For example, the prediction block can be directly output, or a prediction mode can be output, and prediction processing can be performed based on the prediction mode to obtain the prediction block.
[0309] Optionally, the transformation features of at least one of the following can also be determined: at least one reference prediction block and at least one reference reconstruction block. The index corresponding to each transformation feature is then determined, a lookup is performed in at least one lookup table based on the index, and the prediction block after prediction processing is determined or obtained based on the lookup result. For example, if the lookup result is a prediction mode, prediction processing is performed based on the prediction mode to obtain the prediction block, or if the lookup result is a prediction pixel after prediction processing, the prediction block is determined or obtained based on the prediction pixel.
[0310] Optionally, in this embodiment, the lookup table structure is used to approximate the neural network, and the channels in the lookup table structure have the same meaning as the channels in the original neural network.
[0311] Optionally, in this embodiment, the neural network can be a neural network containing complete input and output, or it can be a neural network module containing only a part of the neural network. For example, a neural network module with only one convolutional layer is also a neural network.
[0312] In this embodiment, by determining or obtaining a prediction block based on a neural network and / or a lookup table, and at least one derived block, the advantages of the neural network and / or lookup table can be combined, and the transition features between image blocks can be effectively captured through the derived block, thereby improving the prediction effect of the prediction processing.
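As a rough illustration of how a lookup table can stand in for a neural network, the following sketch quantizes each input feature, combines the quantized values into a single index, and precomputes the table by evaluating a stand-in model at the centre of every quantization bin. Everything here is an assumption made for illustration: the number of levels, the 8-bit sample range, and `toy_network`, which is a placeholder and not the network of this application.

```python
from itertools import product
from typing import List, Sequence

LEVELS = 4        # assumed number of quantization levels per feature
MAX_VALUE = 255   # assumes 8-bit sample values


def quantize(value: int) -> int:
    """Map a sample value to a bin number in [0, LEVELS - 1]."""
    return max(0, min(value * LEVELS // (MAX_VALUE + 1), LEVELS - 1))


def to_index(features: Sequence[int]) -> int:
    """Combine the quantized features into one flat table index."""
    index = 0
    for value in features:
        index = index * LEVELS + quantize(value)
    return index


def toy_network(features: Sequence[int]) -> int:
    """Placeholder for a trained model: here simply the rounded mean."""
    return round(sum(features) / len(features))


def build_lut(num_features: int) -> List[int]:
    """Evaluate the model once per bin combination and store the outputs."""
    step = (MAX_VALUE + 1) // LEVELS
    lut = [0] * (LEVELS ** num_features)
    for bins in product(range(LEVELS), repeat=num_features):
        centres = [b * step + step // 2 for b in bins]
        lut[to_index(centres)] = toy_network(centres)
    return lut


def predict_pixel(lut: Sequence[int], features: Sequence[int]) -> int:
    """Prediction by table lookup instead of running the model."""
    return lut[to_index(features)]


lut = build_lut(2)
print(predict_pixel(lut, [10, 250]))
```

The table size grows as `LEVELS ** num_features`, so a practical design has to keep the number of indexed features and levels small. The sketch makes no claim about how the application bounds this.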
[0313] Fourth embodiment
[0314] Based on any of the above embodiments, a fourth embodiment is proposed.
[0315] In this embodiment, the image processing method further includes at least one of the following steps c1 to c4:
[0316] Step c1: Transform at least one of the first horizontal derived block, the first vertical derived block, and the first horizontal-vertical hybrid derived block to obtain the first transformation feature;
[0317] Optionally, in terms of the luminance component, at least one derived block corresponding to the reference prediction block of at least one reference block can be transformed, or at least one derived block corresponding to the reference reconstruction block of at least one reference block can be transformed.
[0318] Optionally, the transformation method can be at least one of DCT transform, DST transform, KL transform, wavelet transform, Hadamard transform, etc., such as using 2x2 DCT transform for transformation processing.
[0319] Optionally, the luminance component derived blocks of at least one reference prediction block can be transformed to determine or obtain a first transformation feature of the reference prediction block in the luminance component. The first transformation feature can be the pixel features of the block after the luminance component derived blocks of the reference prediction block have been transformed, such as pixel value, pixel position, etc.
[0320] Optionally, the luminance component derived blocks of at least one reference reconstruction block can be transformed to determine or obtain a first transformation feature of the reference reconstruction block in the luminance component. The first transformation feature can be the pixel features of the block after the luminance component derived blocks of the reference reconstruction block have been transformed, such as pixel value, pixel position, etc.
[0321] Optionally, within the luminance component, at least one of the first horizontal derived block, the first vertical derived block, and the first horizontal-vertical mixed derived block corresponding to the reference prediction block and / or the reference reconstruction block in the horizontal direction, the vertical direction, and the horizontal-vertical mixed direction can be transformed, and then the corresponding transformation features can be obtained and used as the first transformation features.
[0322] In this embodiment, by transforming at least one of the first horizontal derivative block, the first vertical derivative block, and the first horizontal-vertical hybrid derivative block in different directions on the luminance component, a first transformation feature is obtained, which facilitates the effective prediction processing of the subsequent neural network and / or lookup table based on the first transformation feature.
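The 2x2 DCT mentioned above can be sketched as follows. The example assumes an orthonormal DCT-II applied to non-overlapping 2x2 tiles, with the four coefficients of each tile collected into four separate planes. This tiling and the plane layout are illustrative assumptions. The sketch also shows why an even number of filled rows and columns is convenient: the block dimensions must be divisible by the transform size.

```python
from typing import List

Plane = List[List[float]]


def dct2x2_planes(block: List[List[int]]) -> List[Plane]:
    """Apply an orthonormal 2x2 DCT-II to each 2x2 tile.

    Returns four planes [dc, horizontal, vertical, diagonal], each of size
    (height / 2) x (width / 2).
    """
    height, width = len(block), len(block[0])
    if height % 2 or width % 2:
        raise ValueError("block dimensions must be even for a 2x2 transform")
    planes: List[Plane] = [[], [], [], []]
    for y in range(0, height, 2):
        rows: List[List[float]] = [[], [], [], []]
        for x in range(0, width, 2):
            a, b = block[y][x], block[y][x + 1]
            c, d = block[y + 1][x], block[y + 1][x + 1]
            rows[0].append((a + b + c + d) / 2)   # DC term
            rows[1].append((a - b + c - d) / 2)   # variation across columns
            rows[2].append((a + b - c - d) / 2)   # variation across rows
            rows[3].append((a - b - c + d) / 2)   # diagonal variation
        for plane, row in zip(planes, rows):
            plane.append(row)
    return planes


# Hypothetical derived block whose right half is zero-filled.
derived = [[63, 5, 0, 0],
           [8, 9, 0, 0]]
dc, horiz, vert, diag = dct2x2_planes(derived)
print(dc, horiz, vert, diag)
```

Each returned plane can be treated as one channel of the first transformation feature in later steps.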
[0323] Step c2: Transform at least one of the second horizontal derived block, the second vertical derived block, and the second horizontal-vertical hybrid derived block to obtain the second transformation feature;
[0324] Optionally, in terms of chroma components, at least one derived block corresponding to the reference prediction block of at least one reference block can be transformed, or at least one derived block corresponding to the reference reconstruction block of at least one reference block can be transformed.
[0325] Optionally, the transformation method can be at least one of DCT transform, DST transform, KL transform, wavelet transform, Hadamard transform, etc., such as using 2x2 DCT transform for transformation processing.
[0326] Optionally, the chroma component derived blocks of at least one reference prediction block can be transformed to determine or obtain a second transformation feature of the reference prediction block in the chroma component. The second transformation feature can be the pixel features of the block after the chroma component derived blocks of the reference prediction block have been transformed, such as pixel value, pixel position, etc.
[0327] Optionally, the chroma component derived blocks of at least one reference reconstruction block can be transformed to determine or obtain a second transformation feature of the reference reconstruction block in the chroma component. The second transformation feature can be the pixel features of the block after the chroma component derived blocks of the reference reconstruction block have been transformed, such as pixel value, pixel position, etc.
[0328] Optionally, within the chroma component, at least one of the second horizontal derived block, the second vertical derived block, and the second horizontal-vertical mixed derived block corresponding to the reference prediction block and / or the reference reconstruction block in the horizontal direction, the vertical direction, and the horizontal-vertical mixed direction can be transformed, and then the corresponding transformation features can be obtained and used as the second transformation features.
[0329] In this embodiment, by transforming at least one of the second horizontal derivative block, the second vertical derivative block, and the second horizontal-vertical hybrid derivative block in different directions on the chromaticity component, a second transformation feature is obtained, which facilitates the effective prediction processing of the subsequent neural network and / or lookup table based on the second transformation feature.
[0330] Step c3: Transform at least one of the third horizontal derived block, the third vertical derived block, and the third horizontal-vertical hybrid derived block to obtain the third transformation feature;
[0331] Optionally, a transformation process can be performed on cross-component derived blocks of the reference reconstruction block and / or the reference prediction block.
[0332] Optionally, the transformation method can be at least one of DCT transform, DST transform, KL transform, wavelet transform, Hadamard transform, etc., such as using 2x2 DCT transform for transformation processing.
[0333] Optionally, a transformation process can be performed on the cross-component derived blocks of at least one reference prediction block. A third transformation feature can be determined based on the characteristics of the transformed block. For example, if it is necessary to obtain the third transformation feature of the reference prediction block in the luminance component, the chrominance component derived blocks obtained by the reference prediction block in the chrominance component can be determined, and the chrominance component derived blocks can be transformed to extract the corresponding transformation features, which can then be used as the third transformation feature of the reference prediction block in the luminance component.
[0334] Optionally, a transformation process can be performed on the cross-component derived blocks of at least one reference reconstruction block, and a third transformation feature can be determined based on the characteristics of the transformed block. The specific implementation process is similar to that of determining the third transformation feature of the reference prediction block.
[0335] Optionally, the third transformation feature can be the pixel features of the block obtained after the cross-component derived blocks of the reference prediction block have been transformed, such as pixel value, pixel position, etc.
[0336] Optionally, a transformation process is performed on at least one of the third horizontal derived block, the third vertical derived block, and the third horizontal-vertical hybrid derived block corresponding to the reference prediction block and / or the reference reconstruction block in the horizontal direction, the vertical direction, and the horizontal-vertical hybrid direction, respectively, and then the corresponding transformation feature is obtained and used as the third transformation feature.
[0337] In this embodiment, by transforming at least one of the third horizontal derivative block, the third vertical derivative block, and the third horizontal-vertical hybrid derivative block in different directions across components, a third transformation feature is obtained, which facilitates the effective prediction processing of the subsequent neural network and / or lookup table based on the third transformation feature.
[0338] Step c4: Transform the reconstructed image information and / or predicted image information of at least one reference reconstruction block and / or reference prediction block to obtain the fourth transformation feature.
[0339] Optionally, the reconstructed image information of the reference reconstruction block can be pixel information, texture information, etc., such as pixel value, pixel position, etc.
[0340] Optionally, the predicted image information of the reference prediction block can be pixel information, texture information, etc., such as pixel value, pixel position, etc.
[0341] Optionally, the reconstructed image information of at least one reference reconstructed block can be transformed, and transformation features can be determined or obtained based on the transformed reconstructed image information, and used as the fourth transformation feature. For example, the pixel information in the transformed reconstructed image information can be used as the fourth transformation feature.
[0342] Optionally, the predicted image information of at least one reference prediction block can be transformed, and the transformation features can be determined or obtained based on the transformed predicted image information and used as the fourth transformation feature. For example, the pixel information in the transformed predicted image information can be used as the fourth transformation feature.
[0343] Optionally, after obtaining at least one reference reconstruction block, the predicted image information of at least one reference prediction block belonging to the same reference block can be obtained based on the at least one reference reconstruction block, and the predicted image information can be transformed. The transformed predicted image information can be used to determine or obtain transformation features and use them as the fourth transformation features, such as using the pixel information in the transformed predicted image information as the fourth transformation features.
[0344] Optionally, after obtaining at least one reference prediction block, the reconstructed image information of at least one reference reconstruction block belonging to the same reference block can be obtained based on the at least one reference prediction block, and the reconstructed image information can be transformed. The transformation feature can be determined or obtained based on the transformed reconstructed image information and used as the fourth transformation feature. For example, the pixel information in the transformed reconstructed image information can be used as the fourth transformation feature.
[0345] In this embodiment, by transforming the reconstructed image information and / or predicted image information of at least one reference reconstruction block and / or reference prediction block, a fourth transformation feature is obtained, which facilitates the effective prediction processing of the subsequent neural network and / or lookup table based on the fourth transformation feature.
[0346] Fifth Embodiment
[0347] Based on any of the above embodiments, a fifth embodiment is proposed.
[0348] In this embodiment, step S1 includes at least one of the following steps d1 to d4:
[0349] Step d1: Based on the neural network and / or lookup table, and the result of channel splicing of the first transformation feature and the fourth transformation feature, determine or obtain the first feature set, and determine or generate the prediction block based on the first feature set;
[0350] Optionally, the first transformation feature can be determined or obtained based on the result of transforming the luminance component derived block of the predicted block on the luminance component. Alternatively, the first transformation feature can be determined or obtained based on the result of transforming the luminance component derived block of the reconstructed block on the luminance component.
[0351] Optionally, the first transformation feature can be determined or obtained based on the result of transforming at least one of the first horizontal derived block, the first vertical derived block, and the first horizontal-vertical hybrid derived block on the luminance component of the predicted block and / or the reconstructed block.
[0352] Optionally, the reconstructed image information of at least one reconstructed block can be transformed to obtain the fourth transformation feature, and the predicted image information of at least one predicted block can be transformed to obtain the fourth transformation feature.
[0353] Optionally, the first feature set may include multiple multi-channel dimensional image patch features that have been stitched together, such as a first transformation feature stitched together and a fourth transformation feature stitched together.
[0354] Optionally, a channel is one slice of the feature map of an image patch along the depth dimension. Each channel represents a certain feature (such as texture, edge, derived distribution, etc.) extracted from the image patch (such as a prediction patch or a reconstruction patch). For example, one channel may be used to detect horizontal edges, and another channel may be used to detect vertical edges, etc.
[0355] Optionally, image block features from at least two channels (such as the first transform feature and / or the fourth transform feature) can be concatenated to obtain the first multi-channel feature. Then, based on the pre-set correspondence between the channel features and the index, the index corresponding to the first multi-channel feature (such as a one-dimensional index, a two-dimensional index, or a three-dimensional index) can be determined or obtained, and input into a lookup table for searching to determine or obtain the predicted block.
[0356] Optionally, at least one first transformation feature and at least one fourth transformation feature can be concatenated using at least one convolutional layer in the neural network to obtain at least one multi-channel dimension image patch feature, such as the multi-channel dimension image patch feature corresponding to the predicted block or the multi-channel dimension image patch feature corresponding to the reconstructed block.
[0357] For example, at least one first transformation feature corresponding to the derived block of at least one reconstructed block and at least one fourth transformation feature corresponding to the reconstructed block are input into a neural network for grouped convolution processing. In each convolution group, the transformation features of multiple channels (such as the first transformation feature and / or the fourth transformation feature) are concatenated to obtain a first feature set containing at least one multi-channel dimension image block feature. Then, at least one image block feature in the first feature set is subjected to subsequent convolution processing until the output result of the neural network is obtained. Based on the output result, the prediction block is determined or obtained. For example, when the output result is a prediction mode, the prediction processing of the block to be predicted is performed according to the prediction mode to obtain the prediction block.
[0358] Optionally, at least one first transform feature and at least one fourth transform feature can be concatenated to obtain a first feature set containing at least one multi-channel dimension image patch feature. The index corresponding to at least one image patch feature in the first feature set can be determined, and the index can be used to search in at least one lookup table to determine or obtain the predicted patch.
[0359] In this embodiment, a first feature set is determined or obtained by channel splicing based on a neural network and / or a lookup table, as well as the first and fourth transform features. A prediction block is determined or generated based on the first feature set. This approach leverages the advantages of neural networks and / or lookup tables, and combines the transform features of the reference reconstructed block and its corresponding derived block on the luminance component with the transform features of the reference prediction block and its corresponding derived block on the luminance component for prediction processing, thereby improving the accuracy of the obtained prediction block.
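Channel splicing as described in step d1 can be pictured as stacking feature planes along the channel dimension. The sketch below assumes each transformation feature is a list of equally sized 2D planes. The names `splice_channels` and `feature_vector` are hypothetical. The per-position vector is the kind of value that could then be mapped to a lookup-table index or passed to a convolutional layer.

```python
from typing import List

Plane = List[List[float]]


def splice_channels(*feature_groups: List[Plane]) -> List[Plane]:
    """Concatenate several groups of feature planes along the channel axis."""
    channels: List[Plane] = []
    for group in feature_groups:
        channels.extend(group)
    if not channels:
        raise ValueError("at least one channel is required")
    height, width = len(channels[0]), len(channels[0][0])
    for plane in channels:
        if len(plane) != height or any(len(row) != width for row in plane):
            raise ValueError("all channels must share the same spatial size")
    return channels


def feature_vector(channels: List[Plane], y: int, x: int) -> List[float]:
    """Collect the values of every channel at one spatial position."""
    return [plane[y][x] for plane in channels]


# Hypothetical features: one plane from a derived block (first feature)
# and two planes from the reference block itself (fourth feature).
first_feature = [[[1, 2], [3, 4]]]
fourth_feature = [[[5, 6], [7, 8]], [[9, 10], [11, 12]]]
first_feature_set = splice_channels(first_feature, fourth_feature)
print(len(first_feature_set), feature_vector(first_feature_set, 0, 1))
```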
[0360] Step d2: Based on the neural network and / or lookup table, and the channel splicing results of the second and fourth transformation features, determine or obtain the first feature set, and determine or generate the prediction block based on the first feature set;
[0361] Optionally, the second transformation feature can be determined or obtained based on the result of transforming the chromaticity component derived block of the predicted block on the chromaticity component. Alternatively, the second transformation feature can be determined or obtained based on the result of transforming the chromaticity component derived block of the reconstructed block on the chromaticity component.
[0362] Optionally, the second transformation feature can be determined or obtained based on the result of transforming at least one of the second horizontal derived block, the second vertical derived block, and the second horizontal-vertical mixed derived block on the chromaticity component of the predicted block and / or the reconstructed block.
[0363] Optionally, the reconstructed image information of at least one reconstructed block can be transformed to obtain the fourth transformation feature, and the predicted image information of at least one predicted block can be transformed to obtain the fourth transformation feature.
[0364] Optionally, the first feature set may include multiple multi-channel dimensional image patch features that have been stitched together, such as second transformation features that have been stitched together and fourth transformation features that have been stitched together.
[0365] Optionally, at least one second transformation feature and at least one fourth transformation feature can be concatenated using at least one convolutional layer in the neural network to obtain at least one multi-channel dimension image patch feature, such as the multi-channel dimension image patch feature corresponding to the predicted block or the multi-channel dimension image patch feature corresponding to the reconstructed block.
[0366] For example, at least one second transformation feature corresponding to the derived block of at least one reconstructed block and at least one fourth transformation feature corresponding to the reconstructed block are input into a neural network for grouped convolution processing. In each convolution group, the transformation features of multiple channels (such as the second transformation feature and / or the fourth transformation feature) are concatenated to obtain a first feature set containing at least one multi-channel dimension image block feature. Then, at least one image block feature in the first feature set is subjected to subsequent convolution processing until the output result of the neural network is obtained. The prediction block is determined or obtained based on the output result. For example, when the output result is a prediction mode, the prediction processing of the block to be predicted is performed according to the prediction mode to obtain the prediction block.
[0367] Optionally, at least one second transformation feature and at least one fourth transformation feature can be concatenated to obtain a first feature set containing at least one multi-channel dimension image patch feature. The index corresponding to at least one image patch feature in the first feature set can be determined, and the index can be used to search in at least one lookup table to determine or obtain the predicted patch.
[0368] In this embodiment, a first feature set is determined or obtained by channel splicing based on a neural network and / or a lookup table, as well as the second and fourth transform features. A prediction block is determined or generated based on the first feature set. This approach leverages the advantages of neural networks and / or lookup tables, and combines the transform features of the reference reconstructed block and its corresponding derived block on the chromaticity component with the transform features of the reference prediction block and its corresponding derived block on the chromaticity component for prediction processing, thereby improving the accuracy of the obtained prediction block.
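The grouped convolution mentioned in the examples for steps d1 and d2 can be sketched in its simplest form, a grouped 1x1 (pointwise) convolution, in which the input channels are split into groups and each group is mixed only with its own weights. The kernel size, the number of groups, and the weight values below are assumptions chosen for illustration. A real network would use trained weights.

```python
from typing import List

Plane = List[List[float]]


def grouped_pointwise_conv(channels: List[Plane],
                           weights: List[List[List[float]]],
                           groups: int) -> List[Plane]:
    """Grouped 1x1 convolution.

    weights[g][o][i] is the weight applied to input channel i of group g
    when producing output channel o of that group.
    """
    if groups < 1 or len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by the group count")
    if len(weights) != groups:
        raise ValueError("one weight set is required per group")
    per_group = len(channels) // groups
    height, width = len(channels[0]), len(channels[0][0])
    outputs: List[Plane] = []
    for g in range(groups):
        group_in = channels[g * per_group:(g + 1) * per_group]
        for w_out in weights[g]:
            if len(w_out) != per_group:
                raise ValueError("weight length must match channels per group")
            plane = [[sum(w * ch[y][x] for w, ch in zip(w_out, group_in))
                      for x in range(width)]
                     for y in range(height)]
            outputs.append(plane)
    return outputs


# Four hypothetical 1x2 feature planes split into two groups of two channels.
channels = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
weights = [[[1, 1]],      # group 0: sum of its two channels
           [[1, -1]]]     # group 1: difference of its two channels
print(grouped_pointwise_conv(channels, weights, groups=2))
```

Grouping keeps the number of weights per output channel small, which is one reason such layers are easier to approximate with lookup tables than fully connected mixing.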
[0369] Step d3: Based on the neural network and / or lookup table, and the channel splicing results of the first transformation feature, the second transformation feature and the fourth transformation feature, determine or obtain the first feature set, and determine or generate the prediction block based on the first feature set;
[0370] Optionally, the first transformation feature can be determined or obtained based on the result of transforming the luminance component derived block of the predicted block on the luminance component. Alternatively, the first transformation feature can be determined or obtained based on the result of transforming the luminance component derived block of the reconstructed block on the luminance component.
[0371] Optionally, the first transformation feature can be determined or obtained based on the result of transforming at least one of the first horizontal derived block, the first vertical derived block, and the first horizontal-vertical hybrid derived block on the luminance component of the predicted block and / or the reconstructed block.
[0372] Optionally, the second transformation feature can be determined or obtained based on the result of transforming the chromaticity component derived block of the predicted block on the chromaticity component. Alternatively, the second transformation feature can be determined or obtained based on the result of transforming the chromaticity component derived block of the reconstructed block on the chromaticity component.
[0373] Optionally, the second transformation feature can be determined or obtained based on the result of transforming at least one of the second horizontal derived block, the second vertical derived block, and the second horizontal-vertical mixed derived block on the chromaticity component of the predicted block and / or the reconstructed block.
[0374] Optionally, the reconstructed image information of at least one reconstructed block can be transformed to obtain the fourth transformation feature, and the predicted image information of at least one predicted block can be transformed to obtain the fourth transformation feature.
[0375] Optionally, the first feature set may include multiple multi-channel dimensional image patch features that have been stitched together, such as a first transformation feature stitched together, a second transformation feature stitched together, and a fourth transformation feature stitched together.
[0376] Optionally, at least one first transformation feature, at least one second transformation feature, and at least one fourth transformation feature can be concatenated using at least one convolutional layer in the neural network to obtain at least one multi-channel dimensional image patch feature, such as the multi-channel dimensional image patch feature corresponding to the predicted block or the multi-channel dimensional image patch feature corresponding to the reconstructed block.
[0377] For example, at least one first transformation feature and at least one second transformation feature corresponding to the derived block of at least one reconstructed block, and at least one fourth transformation feature corresponding to the reconstructed block are input into a neural network for grouped convolution processing. In each convolution group, the transformation features of multiple channels (such as the first transformation feature, the second transformation feature, and / or the fourth transformation feature) are concatenated to obtain a first feature set containing at least one multi-channel dimension image block feature. Then, at least one image block feature in the first feature set is subjected to subsequent convolution processing until the output result of the neural network is obtained. Based on the output result, the prediction block is determined or obtained. For example, when the output result is a prediction mode, the prediction processing of the block to be predicted is performed according to the prediction mode to obtain the prediction block.
[0378] Optionally, at least one first transformation feature, at least one second transformation feature, and at least one fourth transformation feature can be concatenated to obtain a first feature set containing at least one multi-channel dimension image patch feature. The index corresponding to at least one image patch feature in the first feature set can be determined, and the index can be used to search in at least one lookup table to determine or obtain the predicted patch.
[0379] In this embodiment, a first feature set is determined or obtained by channel splicing based on a neural network and / or a lookup table, as well as the first, second, and fourth transformation features. A prediction block is determined or generated based on the first feature set. This approach leverages the advantages of neural networks and / or lookup tables, and combines the transformation features of the reference reconstructed block and its corresponding derived block on the luminance and chrominance components with the transformation features of the reference prediction block and its corresponding derived block on the luminance and chrominance components for prediction processing, thereby improving the accuracy of the obtained prediction block.
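The channel splicing described in this embodiment can be illustrated with a minimal sketch. It assumes each transformation feature is a list of equally sized H x W planes; the names `concat_channels`, `first`, `second`, and `fourth`, and the sample values, are hypothetical and not taken from this application.

```python
from typing import List

Plane = List[List[float]]   # one H x W feature plane (a single channel)
Feature = List[Plane]       # C x H x W feature, stored as a list of channels


def concat_channels(*features: Feature) -> Feature:
    """Concatenate features along the channel axis; all planes must share H x W."""
    planes = [plane for feat in features for plane in feat]
    if not planes:
        raise ValueError("at least one feature is required")
    height, width = len(planes[0]), len(planes[0][0])
    for plane in planes:
        if len(plane) != height or any(len(row) != width for row in plane):
            raise ValueError("all channels must have the same height and width")
    return planes


# Hypothetical 2x2 features: luminance-derived (first), chrominance-derived
# (second), and the block's own image information (fourth).
first = [[[1.0, 2.0], [3.0, 4.0]]]
second = [[[0.5, 0.5], [0.5, 0.5]], [[0.1, 0.2], [0.3, 0.4]]]
fourth = [[[9.0, 8.0], [7.0, 6.0]]]

patch_feature = concat_channels(first, second, fourth)
first_feature_set = [patch_feature]   # one multi-channel image patch feature
print(len(patch_feature))             # 4 channels: 1 + 2 + 1
```

The same helper would apply to the other combinations (first and fourth, second and fourth, third and fourth), since only the number of input features changes.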
[0380] Step d4: Based on the neural network and / or lookup table, and the result of channel splicing of the third and fourth transformation features, determine or obtain the first feature set, and determine or generate the prediction block based on the first feature set.
[0381] Optionally, the third transformation feature can be determined or obtained based on the result of the transformation of the cross-component derived blocks of the predicted block, or the third transformation feature can be determined or obtained based on the result of the transformation of the cross-component derived blocks of the reconstructed block.
[0382] Optionally, a third transformation feature can be determined or obtained based on the result of transforming at least one of the third horizontal derived block, the third vertical derived block, and the third horizontal-vertical hybrid derived block across components of the predicted block and / or the reconstructed block.
[0383] Optionally, the reconstructed image information of at least one reconstructed block can be transformed to obtain the fourth transformation feature, and the predicted image information of at least one predicted block can be transformed to obtain the fourth transformation feature.
[0384] Optionally, the first feature set may include multiple multi-channel dimensional image patch features that have been stitched together, such as third transformation features that have been stitched together and fourth transformation features that have been stitched together.
[0385] Optionally, at least one third transformation feature and at least one fourth transformation feature can be concatenated using at least one convolutional layer in the neural network to obtain at least one multi-channel dimension image patch feature, such as the multi-channel dimension image patch feature corresponding to the predicted block or the multi-channel dimension image patch feature corresponding to the reconstructed block.
[0386] For example, at least one third transform feature corresponding to the derived block of at least one reconstructed block and at least one fourth transform feature corresponding to the reconstructed block are input into a neural network for grouped convolution processing. In each convolution group, the transform features of multiple channels (such as the third transform feature and / or the fourth transform feature) are concatenated to obtain a first feature set containing at least one multi-channel dimension image block feature. Then, at least one image block feature in the first feature set is subjected to subsequent convolution processing until the output result of the neural network is obtained. Based on the output result, the prediction block is determined or obtained. For example, when the output result is a prediction mode, the prediction processing of the block to be predicted is performed according to the prediction mode to obtain the prediction block.
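As a rough illustration of grouped convolution, the sketch below splits the channels into groups and reduces each group to one output plane with a 1x1 (pointwise) weighted sum. The 1x1 kernel, the function name `grouped_pointwise_conv`, and the weights are assumptions chosen for brevity; this application does not specify the kernel size or weights used in each group.

```python
from typing import List

Plane = List[List[float]]


def grouped_pointwise_conv(
    feature: List[Plane], groups: int, weights: List[List[float]]
) -> List[Plane]:
    """Assumed 1x1 grouped convolution: each channel group yields one output plane."""
    channels = len(feature)
    if groups <= 0 or channels % groups != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    per_group = channels // groups
    height, width = len(feature[0]), len(feature[0][0])
    output = []
    for g in range(groups):
        group_planes = feature[g * per_group:(g + 1) * per_group]
        group_weights = weights[g]
        # Weighted sum over the channels of this group only.
        plane = [
            [
                sum(w * p[y][x] for w, p in zip(group_weights, group_planes))
                for x in range(width)
            ]
            for y in range(height)
        ]
        output.append(plane)
    return output


# Four hypothetical 1x2 channels split into two groups of two channels each.
feature = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
result = grouped_pointwise_conv(feature, groups=2, weights=[[1, 1], [1, -1]])
print(result)  # [[[4, 6]], [[-2, -2]]]
```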
[0387] Optionally, at least one third transform feature and at least one fourth transform feature can be concatenated to obtain a first feature set containing at least one multi-channel dimension image patch feature. The index corresponding to at least one image patch feature in the first feature set can be determined, and the index can be used to search in at least one lookup table to determine or obtain the predicted patch.
[0388] In this embodiment, a first feature set is determined or obtained by channel splicing of the third and fourth transformation features, based on a neural network and / or a lookup table, and a prediction block is determined or generated based on the first feature set. This approach leverages the advantages of neural networks and / or lookup tables, and combines the transformation features of the reference reconstructed block and its corresponding derived block across components with the transformation features of the reference prediction block and its corresponding derived block across components for prediction processing, thereby improving the accuracy of the obtained prediction block.
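The lookup-table path can be sketched as follows. The way an index is derived from a feature is not specified here, so the sketch assumes uniform quantization of each sample into a few levels; `feature_index`, the level count, the table contents, and the mode names are all hypothetical.

```python
from typing import Dict, Sequence


def feature_index(
    values: Sequence[float], levels: int = 4, max_value: float = 255.0
) -> int:
    """Map feature samples to a single index by uniform quantization (assumed scheme)."""
    index = 0
    for value in values:
        clipped = min(max(value, 0.0), max_value)
        quantized = min(int(clipped * levels / (max_value + 1.0)), levels - 1)
        index = index * levels + quantized   # treat the levels as base-`levels` digits
    return index


# Hypothetical table mapping an index to a prediction mode name.
lut: Dict[int, str] = {3: "horizontal", 15: "vertical"}

index = feature_index([10.0, 200.0])
mode = lut.get(index, "dc")   # fall back to a default mode for unknown indices
print(index, mode)            # 3 horizontal
```

A table of predicted pixels instead of prediction modes would use the same indexing and differ only in the stored values.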
[0389] Optionally, determining or generating a prediction block based on the first feature set includes at least one of the following steps e1 to e2:
[0390] Step e1: Convolve some features in the first feature set using the convolution module of the neural network to obtain the first convolution feature. Determine or generate the prediction block based on the fusion result of the first convolution feature and the unconvolved features in the first feature set.
[0391] Optionally, the first feature set can be determined or obtained based on at least one of steps d1 to d4 above.
[0392] Optionally, some features in the first feature set may be at least one multi-channel dimensional image patch feature in the first feature set.
[0393] Optionally, the unconvolved features in the first feature set can be multi-channel dimensional image patch features in the first feature set that have not participated in convolution processing.
[0394] Optionally, the neural network may include a first branch containing a shortcut (skip-connection) structure and a second branch containing at least one convolution module. Some features in the first feature set pass through the second branch and are convolved by its convolution module (such as a 3x3 convolutional layer) to obtain the first convolution feature. The remaining features in the first feature set pass through the first branch and form the unconvolved features of the first feature set. The first convolution feature output by the second branch and the unconvolved features output by the first branch are then fused (for example, by channel splicing) to obtain the fusion result. The fusion result can be subjected to subsequent convolution processing to determine or generate the output result of the neural network, and a prediction block is determined or generated based on that output result. For example, the output result may be a prediction block or a prediction mode; in the latter case, the block to be predicted is processed according to the prediction mode to determine or generate the prediction block.
[0395] In this embodiment, the convolution module of the neural network convolves a portion of the features in the first feature set to obtain the first convolution feature, and the prediction block is determined or generated by fusing the first convolution feature with the unconvolved features in the first feature set. Convolving only part of the features (partial convolution) reduces complexity while maintaining prediction performance.
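The two-branch structure of step e1 can be sketched as below, assuming a single shared 3x3 kernel applied per channel with zero padding. The split point `n_conv`, the kernel values, and the function names are illustrative assumptions rather than the network defined by this application.

```python
from typing import List

Plane = List[List[float]]


def conv3x3(plane: Plane, kernel: Plane) -> Plane:
    """3x3 convolution (cross-correlation) with zero padding; output keeps H x W."""
    height, width = len(plane), len(plane[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += plane[yy][xx] * kernel[ky][kx]
            out[y][x] = acc
    return out


def partial_conv(feature: List[Plane], n_conv: int, kernel: Plane) -> List[Plane]:
    """Convolve the first n_conv channels, pass the rest through, then concatenate."""
    convolved = [conv3x3(p, kernel) for p in feature[:n_conv]]  # second branch
    unconvolved = feature[n_conv:]                              # first branch (shortcut)
    return convolved + unconvolved                              # fusion by channel splicing


ones = [[1.0] * 3 for _ in range(3)]   # hypothetical kernel
feature = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
fused = partial_conv(feature, n_conv=1, kernel=ones)
print(fused[0])  # [[10.0, 10.0], [10.0, 10.0]]
print(fused[1])  # [[5.0, 6.0], [7.0, 8.0]]
```

Because only `n_conv` of the channels are convolved, the multiply-accumulate count scales with `n_conv` rather than with the full channel count, which is the complexity saving the embodiment refers to.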
[0396] Step e2: Look up some features of the first feature set in the lookup table to obtain the first lookup table features. Determine or generate the prediction block based on the fusion result of the first lookup table features and the features in the first feature set that were not looked up.
[0397] Optionally, the first feature set can be determined or obtained based on at least one of steps d1 to d4 above.
[0398] Optionally, some features in the first feature set may be at least one multi-channel dimension image patch feature in the first feature set. Optionally, features not searched in the first feature set may be multi-channel dimension image patch features in the first feature set that did not participate in the lookup table lookup operation.
[0399] Optionally, the correspondence between image patch features and indices can be pre-set, and the indices corresponding to some features in the first feature set can be determined based on the correspondence to perform a lookup in the lookup table and obtain the lookup results. The lookup results include the first lookup table features, such as the prediction mode corresponding to the index, the prediction pixels, etc.
[0400] Optionally, the first lookup table features and the features in the first feature set that were not looked up can be fused to obtain a fusion result. Then, based on the fusion result and a subsequent neural network and / or lookup table, a prediction block can be determined or generated.
[0401] Optionally, when the fusion result consists of predicted pixels and features that were not looked up, the neural network can process the features that were not looked up to determine their corresponding predicted pixels. Alternatively, those predicted pixels can be determined or obtained from a lookup table. The prediction block is then obtained from all the predicted pixels corresponding to the block to be predicted.
[0402] Optionally, when the fusion result consists of prediction modes and features that were not looked up, the prediction modes corresponding to those features can be determined based on the lookup table and / or neural network, and a selected prediction mode (such as the prediction mode with the lowest cost) can be determined among the candidate prediction modes. The block to be predicted is then processed according to the selected prediction mode to obtain the prediction block.
[0403] In this embodiment, some features of the first feature set are looked up in the lookup table to obtain the first lookup table features, and the prediction block is determined or generated from the fusion result of the first lookup table features and the features in the first feature set that were not looked up. In this way, the characteristics of the lookup table can be used to improve the prediction effect.
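Step e2 with a mode-valued lookup table might look like the following sketch. The table contents, the cost values, and the `stand_in_network` fallback (a placeholder for the subsequent neural network and / or lookup table) are hypothetical; only the partial lookup, the fusion, and the lowest-cost selection follow the text.

```python
from typing import Callable, Dict, List


def fuse_partial_lookup(
    indices: List[int],
    n_lookup: int,
    lut: Dict[int, str],
    fallback: Callable[[int], str],
) -> List[str]:
    """Resolve the first n_lookup indices via the table and the rest via a fallback."""
    looked_up = [lut[i] for i in indices[:n_lookup]]           # first lookup table features
    not_looked_up = [fallback(i) for i in indices[n_lookup:]]  # handled by a later stage
    return looked_up + not_looked_up                           # fusion by concatenation


def select_mode(candidates: List[str], cost: Dict[str, float]) -> str:
    """Return the candidate prediction mode with the lowest cost."""
    return min(candidates, key=lambda mode: cost[mode])


def stand_in_network(index: int) -> str:
    """Placeholder for the neural network or second table (assumption)."""
    return "vertical" if index % 2 else "dc"


lut = {0: "dc", 1: "planar", 2: "horizontal"}                            # hypothetical
cost = {"dc": 4.0, "planar": 2.5, "horizontal": 3.0, "vertical": 1.5}    # hypothetical

candidates = fuse_partial_lookup([1, 2, 7], n_lookup=2, lut=lut, fallback=stand_in_network)
print(candidates)                     # ['planar', 'horizontal', 'vertical']
print(select_mode(candidates, cost))  # vertical
```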
[0404] Sixth Embodiment
[0405] This application also provides a processing apparatus. Referring to Figure 10, the processing device includes:
[0406] Processing module A10, used to determine or generate a prediction block based on the derived block corresponding to the reference block of at least one block to be predicted.
[0407] Optionally, the reference block of at least one block to be predicted includes a reference prediction block and / or a reference reconstruction block of the reference block; and / or, the reference block is determined or obtained according to at least one of the following:
[0408] At least one of the following of the block to be predicted: upper adjacent pixel, upper non-adjacent pixel, left adjacent pixel, left non-adjacent pixel, upper-left adjacent pixel, upper-left non-adjacent pixel, lower-left adjacent pixel, lower-left non-adjacent pixel, upper-right adjacent pixel, and upper-right non-adjacent pixel;
[0409] At least one of the following corresponding to the block to be predicted: neighboring block, non-neighboring block, cross-component block, co-located block, temporal block, and default block;
[0410] At least one of the width, height, block size, and block area of the block to be predicted;
[0411] A candidate block determined or generated from a candidate motion vector or candidate block vector of the block to be predicted;
[0412] If the first information of the block to be predicted satisfies the first condition, then the reference block is the first reference block;
[0413] If the first information of the block to be predicted satisfies the second condition, then the reference block is the second reference block.
[0414] Optionally, processing module A10 is configured to perform at least one of the following:
[0415] The derived block includes at least one of the following: luminance component derived block, chrominance component derived block, and cross-component derived block;
[0416] The luminance component derived block includes at least one of a first horizontal derived block, a first vertical derived block, and a first horizontal-vertical hybrid derived block;
[0417] The chrominance component derived block includes at least one of a second horizontal derived block, a second vertical derived block, and a second horizontal-vertical hybrid derived block;
[0418] The cross-component derived block includes at least one of a third horizontal derived block, a third vertical derived block, and a third horizontal-vertical hybrid derived block.
[0419] Optionally, processing module A10 is used to perform:
[0420] Based on the reference reconstruction block and / or reference prediction block of at least one reference block, at least one derived block is determined or obtained;
[0421] Based on a neural network and / or a lookup table, and at least one derived block, a prediction block is determined or generated.
[0422] Optionally, the size parameters of the derived block are matched with the size parameters of the reference reconstructed block and / or the reference predicted block.
[0423] Processing module A10 is used to execute:
[0424] Crop a boundary portion region of at least one reference reconstruction block and / or reference prediction block;
[0425] Determine or obtain at least one derived block based on the padding result of padding the at least one cropped reference reconstruction block and / or reference prediction block.
[0426] Optionally, the boundary portion region of at least one reference reconstruction block and / or reference prediction block is determined or obtained based on at least one of the following:
[0427] At least one of the following of the reference reconstruction block and / or reference prediction block: at least one leftmost column, at least one rightmost column, at least one topmost row, and at least one bottommost row;
[0428] The transformation dimension parameter;
[0429] The cropping dimension parameter;
[0430] The sliding area size and / or the preset sliding step size of a sliding window on at least one reconstruction block and / or prediction block.
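As one way to picture the sliding-window option, the sketch below enumerates the regions a window of a given size covers when moved with a preset step over a block. The function name, the `(top, left, bottom, right)` convention, and the choice to keep only windows that lie fully inside the block are assumptions.

```python
from typing import List, Tuple

Region = Tuple[int, int, int, int]   # (top, left, bottom, right), bottom/right exclusive


def sliding_regions(
    height: int, width: int, win_h: int, win_w: int, step: int
) -> List[Region]:
    """List every window position that lies fully inside a height x width block."""
    if step <= 0:
        raise ValueError("step must be positive")
    regions = []
    for top in range(0, height - win_h + 1, step):
        for left in range(0, width - win_w + 1, step):
            regions.append((top, left, top + win_h, left + win_w))
    return regions


# A 4x8 block scanned by a 4x4 window with a step of 2 columns/rows.
print(sliding_regions(4, 8, 4, 4, 2))
# [(0, 0, 4, 4), (0, 2, 4, 6), (0, 4, 4, 8)]
```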
[0431] Optionally, processing module A10 is configured to perform at least one of the following:
[0432] The cropping dimension parameter is less than or equal to the transformation dimension parameter;
[0433] The transformation dimension parameter includes at least one of a transformation width and a transformation height;
[0434] The transformation width is greater than or equal to the number of columns cropped and / or padded in the boundary portion region of at least one reference reconstruction block and / or reference prediction block;
[0435] The transformation height is greater than or equal to the number of rows cropped and / or padded in the boundary portion region of at least one reference reconstruction block and / or reference prediction block.
[0436] Optionally, the padding method for the at least one cropped reference reconstruction block and / or reference prediction block includes at least one of the following:
[0437] Pad an even number of pixel columns on the leftmost or rightmost side of the at least one cropped reference reconstruction block and / or reference prediction block;
[0438] Pad an even number of pixel rows on the top or bottom of the at least one cropped reference reconstruction block and / or reference prediction block.
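The crop-then-pad derivation can be sketched for the horizontal case as follows. The sketch assumes that the leftmost columns are cropped and that the same even number of columns is padded on the right by replicating the last column, so the derived block keeps the size of the reference block; the padding values, the function name, and the side choices are assumptions, since only an even column count is stated above.

```python
from typing import List

Block = List[List[int]]


def horizontal_derived_block(block: Block, crop_cols: int, transform_w: int) -> Block:
    """Crop `crop_cols` leftmost columns, then pad as many columns on the right."""
    width = len(block[0])
    if crop_cols % 2 != 0:
        raise ValueError("this sketch crops and pads an even number of columns")
    if not 0 <= crop_cols < width:
        raise ValueError("crop_cols must leave at least one column")
    if crop_cols > transform_w:
        raise ValueError("cropped/padded columns must not exceed the transformation width")
    # Edge replication is an assumed padding rule, not one stated in the text.
    return [row[crop_cols:] + [row[-1]] * crop_cols for row in block]


reference = [[1, 2, 3, 4], [5, 6, 7, 8]]
derived = horizontal_derived_block(reference, crop_cols=2, transform_w=4)
print(derived)  # [[3, 4, 4, 4], [7, 8, 8, 8]]
```

A vertical derived block would apply the same steps to rows, and a horizontal-vertical hybrid derived block would apply both in sequence.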
[0439] Optionally, processing module A10 is configured to perform at least one of the following:
[0440] Transform at least one of the first horizontal derived block, the first vertical derived block, and the first horizontal-vertical hybrid derived block to obtain the first transformation feature;
[0441] Transform at least one of the second horizontal derived block, the second vertical derived block, and the second horizontal-vertical hybrid derived block to obtain the second transformation feature;
[0442] Transform at least one of the third horizontal derived block, the third vertical derived block, and the third horizontal-vertical hybrid derived block to obtain the third transformation feature;
[0443] The reconstructed image information and / or predicted image information of at least one reconstructed block and / or predicted block are transformed to obtain the fourth transformation feature.
[0444] Optionally, processing module A10 is configured to perform at least one of the following:
[0445] Based on the neural network and / or lookup table, and the result of channel splicing of the first transformation feature and the fourth transformation feature, a first feature set is determined or obtained, and a prediction block is determined or generated based on the first feature set;
[0446] Based on the neural network and / or lookup table, and the results of channel splicing of the second and fourth transformation features, a first feature set is determined or obtained, and a prediction block is determined or generated based on the first feature set.
[0447] Based on the neural network and / or lookup table, and the result of channel splicing of the first transformation feature, the second transformation feature and the fourth transformation feature, the first feature set is determined or obtained, and the prediction block is determined or generated based on the first feature set;
[0448] Based on the neural network and / or lookup table, as well as the channel concatenation results of the third and fourth transformation features, a first feature set is determined or obtained, and a prediction block is determined or generated based on the first feature set.
[0449] Optionally, the prediction block is determined or generated based on the first feature set, including at least one of the following:
[0450] The convolution module of the neural network convolves some features in the first feature set to obtain the first convolution feature, and the prediction block is determined or generated based on the fusion result of the first convolution feature and the unconvolved features in the first feature set;
[0451] Some features in the first feature set are looked up in the lookup table to obtain the first lookup table features, and the prediction block is determined or generated based on the fusion result of the first lookup table features and the features in the first feature set that were not looked up.
[0452] The processing device provided in this application embodiment is similar in implementation principle and beneficial effect to the technical solution shown in the corresponding method embodiment above, and will not be described again here.
[0453] This application also provides a processing device, including a memory and a processor. The memory stores an image processing program, and when the image processing program is executed by the processor, it implements the steps of the image processing method in any of the above embodiments.
[0454] This application also provides a storage medium storing an image processing program, which, when executed by a processor, implements the steps of the image processing method in any of the above embodiments.
[0455] In the embodiments of the processing device and storage medium provided in this application, all the technical features of any of the above-described image processing method embodiments may be included. The extended and explanatory content of the specification is basically the same as that of the embodiments of the above methods, and will not be repeated here.
[0456] This application also provides a computer program product, which includes computer program code. When the computer program code is run on a computer, it causes the computer to perform the methods described in the various possible implementations above.
[0457] This application also provides a chip, including a memory and a processor. The memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that a device with the chip installed performs the methods described in the various possible implementations above.
[0458] It is understood that the above scenarios are merely examples and do not constitute a limitation on the application scenarios of the technical solutions provided in the embodiments of this application. The technical solutions of this application can also be applied to other scenarios. For example, as those skilled in the art will know, with the evolution of system architecture and the emergence of new business scenarios, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.
[0459] The sequence numbers of the embodiments in this application are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.
[0460] The steps in the method of this application embodiment can be adjusted, combined, or deleted according to actual needs.
[0461] The units in the device of this application embodiment can be merged, divided, and deleted according to actual needs.
[0462] In this application, the same or similar terms, concepts, technical solutions and / or application scenario descriptions are generally described in detail only when they appear for the first time. When they appear again, they are generally not repeated for the sake of brevity. When understanding the technical solutions and other contents of this application, the same or similar terms, concepts, technical solutions and / or application scenario descriptions that are not described in detail later can be referred to their previous relevant detailed descriptions.
[0463] In this application, the descriptions of the various embodiments have different focuses. For parts that are not described in detail or recorded in a certain embodiment, please refer to the relevant descriptions of other embodiments.
[0464] The technical features of the present application can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of the present application.
[0465] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) as described above, and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, controlled terminal, or network device, etc.) to execute the methods of each embodiment of this application.
[0466] The above embodiments can be implemented, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, they can be implemented, in whole or in part, as a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions according to the embodiments of this application are produced. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions can be stored in a storage medium or transmitted from one storage medium to another. For example, computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) means. The storage medium can be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, storage disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid-state disk (SSD)).
[0467] The above are merely preferred embodiments of this application and do not limit the patent scope of this application. Any equivalent structural or procedural transformations made using the content of this application's specification and drawings, or direct or indirect applications in other related technical fields, are similarly included within the patent protection scope of this application.
Claims
1. An image processing method, characterized in that it includes the following steps: S1, determine or generate a prediction block based on the derived block corresponding to the reference block of at least one block to be predicted; wherein step S1 includes: S11, based on the reference reconstruction block and / or reference prediction block of at least one reference block, determine or obtain at least one derived block, wherein step S11 includes: cropping the boundary portion region of at least one reference reconstruction block and / or reference prediction block, and determining or obtaining at least one derived block based on the padding result of padding the at least one cropped reference reconstruction block and / or reference prediction block; and S12, determine or generate a prediction block based on the neural network and / or lookup table, and at least one derived block.
2. The image processing method as described in claim 1, characterized in that the reference block of the at least one block to be predicted includes a reference prediction block and / or a reference reconstruction block of the reference block; and / or, the reference block is determined or obtained according to at least one of the following: at least one of the upper adjacent pixel, upper non-adjacent pixel, left adjacent pixel, left non-adjacent pixel, upper-left adjacent pixel, upper-left non-adjacent pixel, lower-left adjacent pixel, lower-left non-adjacent pixel, upper-right adjacent pixel, and upper-right non-adjacent pixel of the block to be predicted; at least one of the neighboring block, non-neighboring block, cross-component block, co-located block, temporal block, and default block corresponding to the block to be predicted; at least one of the width, height, block size, and block area of the block to be predicted; a candidate block determined or generated from a candidate motion vector or candidate block vector of the block to be predicted; if the first information of the block to be predicted satisfies the first condition, the reference block is the first reference block; if the first information of the block to be predicted satisfies the second condition, the reference block is the second reference block. The first and second conditions are conditions set based on the first information. The first information includes at least one of the following: at least one of the upper adjacent pixel, upper non-adjacent pixel, left adjacent pixel, left non-adjacent pixel, upper-left adjacent pixel, upper-left non-adjacent pixel, lower-left adjacent pixel, lower-left non-adjacent pixel, upper-right adjacent pixel, and upper-right non-adjacent pixel of the block to be predicted; at least one of the neighboring block, non-neighboring block, cross-component block, co-located block, temporal block, and default block corresponding to the block to be predicted; at least one of the width, height, block size, and block area of the block to be predicted; a candidate block determined or generated from a candidate motion vector or candidate block vector of the block to be predicted. The second condition is different from the first condition, and the second reference block is different from the first reference block. The first reference block is at least one of the following corresponding to the block to be predicted: neighboring block, non-neighboring block, cross-component block, co-located block, temporal block, candidate block, and default block. The second reference block is a block, other than the first reference block, among at least one of the following corresponding to the block to be predicted: neighboring block, non-neighboring block, cross-component block, co-located block, temporal block, candidate block, and default block.
3. The image processing method as described in claim 1, characterized in that it also includes at least one of the following: the derived block includes at least one of a luminance component derived block, a chrominance component derived block, and a cross-component derived block; the luminance component derived block includes at least one of a first horizontal derived block, a first vertical derived block, and a first horizontal-vertical hybrid derived block; the chrominance component derived block includes at least one of a second horizontal derived block, a second vertical derived block, and a second horizontal-vertical hybrid derived block; the cross-component derived block includes at least one of a third horizontal derived block, a third vertical derived block, and a third horizontal-vertical hybrid derived block.
4. The image processing method as described in claim 1, characterized in that, The size parameters of the derived block are matched with the size parameters of the reference reconstructed block and / or the reference prediction block.
5. The image processing method as described in claim 4, characterized in that a portion of at least one reference reconstructed block and / or reference prediction block is determined or obtained based on at least one of the following: at least one of at least one leftmost column, at least one rightmost column, at least one topmost row, and at least one bottommost row of the reference reconstructed block and / or reference prediction block; transformation dimension parameters; cropping dimension parameters; the sliding area size and / or the preset sliding step size of a sliding window on at least one reconstructed block and / or prediction block.
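As an illustration of claim 5, the following minimal sketch shows two of the listed ways to pick out a portion of a reference block: taking outermost rows or columns, and stepping a sliding window across the block. It assumes a block is a plain list of pixel rows; the names `border_strip` and `sliding_windows` are hypothetical and do not appear in the application.

```python
from typing import Iterator, List, Tuple

Block = List[List[int]]  # assumption: a block is a list of pixel rows


def border_strip(block: Block, side: str, count: int = 1) -> Block:
    """Return the `count` outermost rows or columns on the given side."""
    height, width = len(block), len(block[0])
    limit = height if side in ("top", "bottom") else width
    if not 1 <= count <= limit:
        raise ValueError("count must be between 1 and the block dimension")
    if side == "top":
        return [row[:] for row in block[:count]]
    if side == "bottom":
        return [row[:] for row in block[height - count:]]
    if side == "left":
        return [row[:count] for row in block]
    if side == "right":
        return [row[width - count:] for row in block]
    raise ValueError("side must be 'top', 'bottom', 'left' or 'right'")


def sliding_windows(
    block: Block, win_h: int, win_w: int, step: int
) -> Iterator[Tuple[int, int, Block]]:
    """Yield (top, left, window) for every window position inside the block."""
    if step < 1:
        raise ValueError("step must be positive")
    height, width = len(block), len(block[0])
    for top in range(0, height - win_h + 1, step):
        for left in range(0, width - win_w + 1, step):
            window = [row[left:left + win_w] for row in block[top:top + win_h]]
            yield top, left, window
```

For a 4x4 block, `border_strip(block, "left", 1)` returns the leftmost column as a 4x1 block, and `sliding_windows(block, 2, 2, 2)` visits four non-overlapping 2x2 windows.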
6. The image processing method as described in claim 5, characterized in that it further includes at least one of the following: the cropping dimension parameter is less than or equal to the transformation dimension parameter; the transformation dimension parameters include at least one of a transformation width and a transformation height; the transformation width is greater than or equal to the number of columns that are cropped and / or filled in the portion of the at least one reference reconstructed block and / or reference prediction block; the transformation height is greater than or equal to the number of rows that are cropped and / or filled in the portion of the at least one reference reconstructed block and / or reference prediction block.
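The size relationships in claim 6 can be read as simple inequalities. The sketch below is one possible reading that checks all of them together, although the claim only requires at least one; the function name and parameter names are hypothetical.

```python
def check_dimension_constraints(
    transform_w: int,
    transform_h: int,
    crop_w: int,
    crop_h: int,
    changed_cols: int = 0,
    changed_rows: int = 0,
) -> bool:
    """Return True when every inequality listed in claim 6 holds.

    changed_cols / changed_rows are the numbers of columns / rows that are
    cropped and/or filled (assumed names, not taken from the application).
    """
    return (
        crop_w <= transform_w            # cropping width within transformation width
        and crop_h <= transform_h        # cropping height within transformation height
        and changed_cols <= transform_w  # cropped/filled columns within transformation width
        and changed_rows <= transform_h  # cropped/filled rows within transformation height
    )
```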
7. The image processing method as described in claim 1, characterized in that the filling method for filling at least one reference reconstructed block and / or reference prediction block after cropping includes at least one of the following: filling an even number of columns of pixels at the leftmost or rightmost side of the at least one cropped reference reconstructed block and / or reference prediction block; filling an even number of rows of pixels at the top or bottom of the at least one cropped reference reconstructed block and / or reference prediction block.
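A minimal sketch of the crop-then-fill operation in claim 7 follows. The claim does not state which pixel values are used for filling, so this example assumes edge replication (repeating the nearest border pixel); the helper names `crop`, `fill_columns` and `fill_rows` are hypothetical.

```python
from typing import List

Block = List[List[int]]  # assumption: a block is a list of pixel rows


def crop(block: Block, top: int, left: int, height: int, width: int) -> Block:
    """Cut a height x width region whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in block[top:top + height]]


def _require_even(count: int) -> None:
    if count < 0 or count % 2 != 0:
        raise ValueError("the number of filled rows/columns must be even")


def fill_columns(block: Block, count: int, side: str) -> Block:
    """Add an even number of columns on the left or right (edge replication)."""
    _require_even(count)
    if side == "left":
        return [[row[0]] * count + row for row in block]
    if side == "right":
        return [row + [row[-1]] * count for row in block]
    raise ValueError("side must be 'left' or 'right'")


def fill_rows(block: Block, count: int, side: str) -> Block:
    """Add an even number of rows at the top or bottom (edge replication)."""
    _require_even(count)
    body = [row[:] for row in block]
    if side == "top":
        return [block[0][:] for _ in range(count)] + body
    if side == "bottom":
        return body + [block[-1][:] for _ in range(count)]
    raise ValueError("side must be 'top' or 'bottom'")
```

Cropping a 2x2 region out of a 4x4 block and then filling two columns on the right and two rows on top restores a 4x4 size, which is one way the cropped block could be brought back to a required size parameter.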
8. The image processing method as described in claim 1, characterized in that it further includes at least one of the following: transforming at least one of the first horizontal derived block, the first vertical derived block, and the first horizontal-vertical hybrid derived block to obtain a first transformation feature; transforming at least one of the second horizontal derived block, the second vertical derived block, and the second horizontal-vertical hybrid derived block to obtain a second transformation feature; transforming at least one of the third horizontal derived block, the third vertical derived block, and the third horizontal-vertical hybrid derived block to obtain a third transformation feature; transforming the reconstructed image information and / or predicted image information of at least one reconstructed block and / or prediction block to obtain a fourth transformation feature.
9. The image processing method as described in claim 8, characterized in that step S1 includes at least one of the following: determining or obtaining a first feature set based on the neural network and / or lookup table and on the result of channel splicing of the first transformation feature and the fourth transformation feature, and determining or generating the prediction block based on the first feature set; determining or obtaining a first feature set based on the neural network and / or lookup table and on the result of channel splicing of the second transformation feature and the fourth transformation feature, and determining or generating the prediction block based on the first feature set; determining or obtaining a first feature set based on the neural network and / or lookup table and on the result of channel splicing of the first transformation feature, the second transformation feature, and the fourth transformation feature, and determining or generating the prediction block based on the first feature set; determining or obtaining a first feature set based on the neural network and / or lookup table and on the result of channel splicing of the third transformation feature and the fourth transformation feature, and determining or generating the prediction block based on the first feature set.
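Channel splicing, as used in claim 9, is commonly understood as concatenating feature maps along the channel axis. The sketch below illustrates that reading under the assumption that a feature is a list of equally sized 2-D channels; the function name `splice_channels` is hypothetical.

```python
from typing import List

Channel = List[List[float]]  # one 2-D feature map
Feature = List[Channel]      # assumption: a feature is a list of channels


def splice_channels(*features: Feature) -> Feature:
    """Concatenate features along the channel axis.

    Every channel must have the same height and width; otherwise the
    features cannot be stacked and a ValueError is raised.
    """
    if not features or any(not feature for feature in features):
        raise ValueError("at least one non-empty feature is required")
    height = len(features[0][0])
    width = len(features[0][0][0])
    spliced: Feature = []
    for feature in features:
        for channel in feature:
            if len(channel) != height or any(len(row) != width for row in channel):
                raise ValueError("all channels must share the same height and width")
            spliced.append([row[:] for row in channel])
    return spliced
```

Splicing a two-channel first transformation feature with a one-channel fourth transformation feature yields a three-channel result, which would then be passed to the neural network and / or lookup table.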
10. The image processing method as described in claim 9, characterized in that determining or generating the prediction block based on the first feature set includes at least one of the following: performing convolution on some of the features in the first feature set through a convolution module of the neural network to obtain a first convolution feature, and fusing the first convolution feature with the features in the first feature set that were not convolved to determine or generate the prediction block; looking up some of the features in the first feature set through a first lookup table to obtain first lookup table features, and fusing the first lookup table features with the features in the first feature set that were not looked up to determine or generate the prediction block.
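The partial processing described in claim 10 can be sketched as follows: only the first few channels of the feature set are convolved (or passed through a lookup table), and the result is fused with the remaining, untouched channels. The claim does not specify the kernel, the table contents, or the fusion rule, so this example assumes a zero-padded 3x3 kernel, a list-indexed table, and fusion by per-pixel averaging. All names are hypothetical.

```python
from typing import Callable, List, Sequence

Channel = List[List[float]]  # one 2-D feature map


def conv3x3_same(channel: Channel, kernel: Sequence[Sequence[float]]) -> Channel:
    """Zero-padded 3x3 cross-correlation; the output keeps the input size."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[ky][kx] * channel[yy][xx]
            out[y][x] = acc
    return out


def lut_map(channel: List[List[int]], table: Sequence[float]) -> Channel:
    """Replace each integer sample by its lookup-table entry."""
    return [[table[value] for value in row] for row in channel]


def fuse_mean(channels: Sequence[Channel]) -> Channel:
    """Assumed fusion rule: per-pixel average over all channels."""
    h, w, n = len(channels[0]), len(channels[0][0]), len(channels)
    return [[sum(c[y][x] for c in channels) / n for x in range(w)] for y in range(h)]


def predict_partial(
    feature_set: Sequence[Channel],
    num_processed: int,
    process: Callable[[Channel], Channel],
) -> Channel:
    """Process the first `num_processed` channels, then fuse with the rest."""
    processed = [process(c) for c in feature_set[:num_processed]]
    untouched = list(feature_set[num_processed:])
    return fuse_mean(processed + untouched)
```

Passing `lambda c: conv3x3_same(c, kernel)` as `process` gives the convolution variant, and `lambda c: lut_map(c, table)` gives the lookup-table variant. Processing only part of the feature set is one way to limit the added computation.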
11. A processing apparatus, characterized in that it comprises: a memory and a processor, wherein the memory stores an image processing program, and the image processing program, when executed by the processor, implements the steps of the image processing method as described in any one of claims 1 to 10.
12. A storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the steps of the image processing method as described in any one of claims 1 to 10.