Image processing method and device, electronic equipment and storage medium
By splitting an image into branch images and processing them separately, the problems of image accuracy and speed when taking pictures on terminal devices are solved, achieving low-power and high-efficiency image enhancement effects and improving image quality.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING XIAOMI MOBILE SOFTWARE CO LTD
- Filing Date
- 2021-12-28
- Publication Date
- 2026-06-12
AI Technical Summary
Terminal devices cannot simultaneously meet the requirements of image accuracy and processing speed when taking pictures. Due to the limitations of complex background environments and limited imaging sensors, image degradation phenomena such as noise, blurring, and low resolution occur. Furthermore, existing image enhancement algorithms increase time consumption and power consumption.
The image to be processed is split into multiple branch images, and each branch image is input into a corresponding image processing model for processing. Finally, the results of the branch images are fused together. The characteristics of different branch images are used to match their corresponding processing models, thereby reducing the amount of processing and power consumption.
Achieve efficient and high-precision image processing with low power consumption, eliminate degradation phenomena such as noise, blur, and low resolution, and improve image quality.
Figure CN116416168B_ABST
Abstract
Description
Technical Field
[0001] This disclosure relates to the field of image processing technology, and specifically to an image processing method, apparatus, electronic device, and storage medium.
Background Technology
[0002] In recent years, the camera functions of terminal devices have been continuously developing, while users' demands for image quality have also been increasing. When taking photos with a terminal device, complex background environments and limited camera sensors can cause degradation phenomena such as noise, blurriness, and low resolution in the captured images, ultimately affecting the image quality. Therefore, before the image is displayed on the screen, it undergoes many related image enhancement algorithms to eliminate these degradation effects. However, limited by the computing power of terminal devices, these image enhancement algorithms increase the time consumption and power consumption of the terminal device. Therefore, terminal devices cannot simultaneously meet the requirements of image accuracy and processing speed when taking photos.
Summary of the Invention
[0003] To overcome the problems existing in the related technologies, this disclosure provides an image processing method, apparatus, electronic device, and storage medium to solve the defects in the related technologies.
[0004] According to a first aspect of the present disclosure, an image processing method is provided, comprising:
[0005] Obtain the image to be processed;
[0006] The image to be processed is split into multiple branch images, wherein the image information of each branch image includes the image information represented by multiple consecutive bits in the image information of the image to be processed;
[0007] The multiple branch images are respectively input into the corresponding image processing model, and each image processing model outputs the processing result of the corresponding branch image;
[0008] The processing results of multiple branch images are fused together to obtain the processing result of the image to be processed.
[0009] In one embodiment, splitting the image to be processed into multiple branch images includes:
[0010] Convert the first floating-point number of the pixels in the image to be processed from decimal to N-ary;
[0011] For each branch image, multiple consecutive bits are extracted from the first floating-point number according to the bit range of that branch image, and the N-ary second floating-point number composed of the consecutive bits is converted into decimal;
[0012] The floating-point values of the pixels in the image to be processed are updated to the second floating-point value corresponding to the branch image to obtain the branch image.
[0013] In one embodiment, the union of the bit ranges of the plurality of branch images is the total number of bits of the image to be processed; and / or,
[0014] The sum of the widths of the bit ranges of the multiple branch images is greater than or equal to the bit width of the image to be processed.
[0015] In one embodiment, the image processing model is matched with a plurality of consecutive bits of the corresponding image to be processed.
[0016] In one embodiment, fusing the processing results of multiple branch images to obtain the processing result of the image to be processed includes:
[0017] The third floating-point number of the pixel of the processing result of each branch image is converted from decimal to N-ary;
[0018] According to the bit range of each branch image, the third floating-point number of the pixel of the processing result of each branch image is added to the corresponding bit of the fourth floating-point number;
[0019] The fourth floating-point number is converted from N-ary to decimal, and the floating-point number of the pixels of the image to be processed is updated to the fourth floating-point number to obtain the processing result of the image to be processed.
[0020] In one embodiment, the method further includes:
[0021] Obtain sample images and corresponding label images from the training set;
[0022] The sample image is split into multiple sample branch images, and the label image is split into multiple label branch images, wherein each sample branch image includes multiple consecutive bits of the sample image, and each label branch image includes multiple consecutive bits of the label image;
[0023] The sample branch images are respectively input into the corresponding image processing model, and each image processing model outputs the processing result of the corresponding sample branch image;
[0024] A first network loss value is determined based on the processing result of the sample branch image and the corresponding label branch image, and the network parameters of the corresponding image processing model are adjusted based on the first network loss value.
[0025] In one embodiment, the method further includes:
[0026] The processing results of multiple sample branch images are fused together to obtain the processing result of the sample image;
[0027] A second network loss value is determined based on the processing result of the sample image and the corresponding label image, and the network parameters of each image processing model are adjusted based on the second network loss value.
[0028] In one embodiment, the method further includes:
[0029] The corresponding image processing model is quantized based on each sample branch image.
[0030] According to a second aspect of the present disclosure, an image processing apparatus is provided, comprising:
[0031] The acquisition module is used to acquire the image to be processed;
[0032] The splitting module is used to split the image to be processed into multiple branch images, wherein the image information of the branch images includes image information represented by multiple consecutive bits in the image information of the image to be processed;
[0033] The processing module is used to input the multiple branch images into the corresponding image processing models respectively, and each image processing model outputs the processing result of the corresponding branch image;
[0034] The fusion module is used to fuse the processing results of multiple branch images to obtain the processing result of the image to be processed.
[0035] In one embodiment, the splitting module is used for:
[0036] Convert the first floating-point number of the pixels in the image to be processed from decimal to N-ary;
[0037] For each branch image, multiple consecutive bits are extracted from the first floating-point number according to the bit range of that branch image, and the N-ary second floating-point number composed of the consecutive bits is converted into decimal;
[0038] The floating-point values of the pixels in the image to be processed are updated to the second floating-point value corresponding to the branch image to obtain the branch image.
[0039] In one embodiment, the union of the bit ranges of the plurality of branch images is the total number of bits of the image to be processed, and the sum of the widths of the bit ranges of the plurality of branch images is greater than or equal to the bit width of the image to be processed.
[0040] In one embodiment, the image processing model is matched with a plurality of consecutive bits of the corresponding image to be processed.
[0041] In one embodiment, the fusion module is used for:
[0042] The third floating-point number of the pixel of the processing result of each branch image is converted from decimal to N-ary;
[0043] According to the bit range of each branch image, the third floating-point number of the pixel of the processing result of each branch image is added to the corresponding bit of the fourth floating-point number;
[0044] The fourth floating-point number is converted from N-ary to decimal, and the floating-point number of the pixels of the image to be processed is updated to the fourth floating-point number to obtain the processing result of the image to be processed.
[0045] In one embodiment, a first training module is included, for:
[0046] Obtain sample images and corresponding label images from the training set;
[0047] The sample image is split into multiple sample branch images, and the label image is split into multiple label branch images, wherein each sample branch image includes multiple consecutive bits of the sample image, and each label branch image includes multiple consecutive bits of the label image;
[0048] The sample branch images are respectively input into the corresponding image processing model, and each image processing model outputs the processing result of the corresponding sample branch image;
[0049] A first network loss value is determined based on the processing result of the sample branch image and the corresponding label branch image, and the network parameters of the corresponding image processing model are adjusted based on the first network loss value.
[0050] In one embodiment, a second training module is included, for:
[0051] The processing results of multiple sample branch images are fused together to obtain the processing result of the sample image;
[0052] A second network loss value is determined based on the processing result of the sample image and the corresponding label image, and the network parameters of each image processing model are adjusted based on the second network loss value.
[0053] In one embodiment, a quantization module is included, for:
[0054] The corresponding image processing model is quantized based on each sample branch image.
[0055] According to a third aspect of the present disclosure, an electronic device is provided, the electronic device including a memory and a processor, the memory being configured to store computer instructions executable on the processor, and the processor being configured to implement the image processing method described in the first aspect when executing the computer instructions.
[0056] According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the method described in the first aspect.
[0057] The technical solutions provided by the embodiments of this disclosure may include the following beneficial effects:
[0058] This disclosure acquires an image to be processed, splits it into multiple branch images, inputs each branch image into a corresponding image processing model, and each image processing model outputs a processing result for its corresponding branch image. Finally, the processing results of the multiple branch images are fused to obtain the final processing result for the image to be processed. Since the image information of each branch image includes image information represented by multiple consecutive bits from the image information of the image to be processed, each branch image can carry some characteristics of the image to be processed. Because each branch image is processed by a different image processing model, the processing content and intensity of each image processing model can be adapted to the image characteristics carried by its corresponding branch image, thereby ensuring the accuracy of image processing. Furthermore, compared to related technologies that perform all processing on the image to be processed, each image processing model in this application targets the smaller number of bits in its branch image and processes only a portion of the content, thereby reducing the power consumption of image processing and improving its efficiency. In other words, the image processing method provided in this application can perform image processing efficiently and with high precision at low power consumption. If applied to the shooting scenario of terminal devices, it can eliminate image degradation phenomena such as noise, blur, and low resolution, and improve image quality. That is, the terminal device can simultaneously meet the requirements of image accuracy and processing speed when taking pictures.
Attached Figure Description
[0059] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.
[0060] Figure 1 is a flowchart illustrating an image processing method according to an exemplary embodiment of this disclosure;
[0061] Figure 2 is a schematic diagram illustrating an image splitting process according to an exemplary embodiment of this disclosure;
[0062] Figure 3 is a flowchart illustrating an image enhancement process according to an exemplary embodiment of this disclosure;
[0063] Figure 4 is a schematic structural diagram of an image processing apparatus according to an exemplary embodiment of this disclosure;
[0064] Figure 5 is a structural block diagram of an electronic device according to an exemplary embodiment of this disclosure.
Detailed Implementation
[0065] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this disclosure as detailed in the appended claims.
[0066] The terminology used in this disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The singular forms “a,” “an,” and “the” as used in this disclosure and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should be understood that the term “and / or” as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
[0067] It should be understood that although the terms first, second, third, etc., may be used in this disclosure to describe various information, such information should not be limited to these terms. These terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of this disclosure, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when," "upon," or "in response to determining."
[0068] In recent years, the camera functions of terminal devices have been continuously developing, while users' demands for image quality have also been increasing. When taking photos with a terminal device, complex background environments and limited camera sensors can cause degradation phenomena such as noise, blurriness, and low resolution in the captured images, ultimately affecting the image quality. Therefore, before the image is displayed on the screen, it undergoes many related image enhancement algorithms to eliminate these degradation effects. However, limited by the computing power of terminal devices, these image enhancement algorithms increase the time consumption and power consumption of the terminal device. Therefore, terminal devices cannot simultaneously meet the requirements of image accuracy and processing speed when taking photos.
[0069] For example, the most common solution to the problems of slow processing and high power consumption in terminal devices is model acceleration, such as quantization, pruning, and distillation. Among these, quantization is widely used due to its speed and ease of operation. Model quantization is a technique that converts floating-point calculations into low-bit fixed-point calculations, which can effectively reduce the computational intensity, parameter size, and memory consumption of the model. However, model quantization is an approximation algorithm, and accuracy loss is a serious problem. Model quantization is often used in tasks such as detection, segmentation, and recognition because these tasks only need to output semantic information such as masks and confidence scores, and do not require high precision. For image enhancement tasks, however, the algorithm outputs a complete image, and even pixel-level errors can lead to visual discomfort. Therefore, while current model quantization can solve the problems of slow processing and high power consumption, it cannot meet the accuracy requirements of image processing.
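The approximation error mentioned above can be illustrated with a minimal sketch. It assumes plain uniform quantization of a value in [0, 1], which is only one of many possible schemes and is not stated in this disclosure; the function names are hypothetical.

```python
def quantize(x: float, bits: int) -> int:
    """Map a float in [0, 1] to an integer code with `bits` bits (assumed uniform scheme)."""
    levels = (1 << bits) - 1
    return round(x * levels)


def dequantize(q: int, bits: int) -> float:
    """Map an integer code back to a float in [0, 1]."""
    return q / ((1 << bits) - 1)


def quantization_error(x: float, bits: int) -> float:
    """Absolute error introduced by a quantize/dequantize round trip."""
    return abs(dequantize(quantize(x, bits), bits) - x)


x = 0.123456789
for bits in (8, 16):
    print(bits, quantize(x, bits), quantization_error(x, bits))
```

The round-trip error is bounded by half a quantization step, so it shrinks as the bit width grows, but it never vanishes for an arbitrary floating-point value.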
[0070] Based on this, in a first aspect, at least one embodiment of this disclosure provides an image processing method. Referring to Figure 1, which illustrates the flow of the method, the method includes steps S101 to S104.
[0071] This image processing method can be used to enhance images, such as images taken by a terminal device, to eliminate degradation problems such as noise, blur, and low resolution.
[0072] This method can be executed by electronic devices such as terminal devices or servers. Terminal devices can be user equipment (UE), mobile devices, user terminals, terminals, cellular phones, cordless phones, personal digital assistant (PDA) handheld devices, computing devices, in-vehicle devices, wearable devices, etc. This method can be implemented by a processor calling computer-readable instructions stored in memory. Alternatively, this method can be executed by a server, such as a local server or a cloud server.
[0073] In step S101, the image to be processed is acquired.
[0074] The image to be processed can be an image captured by a terminal device but not displayed on the screen. In other words, the image to be processed needs to be processed by the method provided in this embodiment before it is displayed on the screen. The image to be processed can also be an image that needs to be processed in other applications or other scenarios.
[0075] In step S102, the image to be processed is split into multiple branch images, wherein the image information of the branch images includes the image information represented by multiple consecutive bits in the image information of the image to be processed.
[0076] The image to be processed can be 64-bit, 32-bit, 16-bit, etc. It can be split according to its bit count; for example, a 64-bit image can be split into four 16-bit branches, a 32-bit image into two 8-bit branches and one 16-bit branch, or a 16-bit image into two 8-bit branches. The bit ranges of adjacent branches can be separated by other bits, adjoin each other, or partially overlap; preferably, adjacent branches adjoin each other.
[0077] In one possible embodiment, the image to be processed can be split into multiple branch images as follows: First, the first floating-point number of the pixels of the image to be processed is converted from decimal to N-ary; next, a series of consecutive bits is extracted from the first floating-point number according to the bit range of each branch image, and the N-ary second floating-point number composed of the series of consecutive bits is converted to decimal; finally, the floating-point number of the pixels of the image to be processed is updated to the second floating-point number corresponding to the branch image to obtain the branch image. Here, N-ary denotes a non-decimal base, such as binary, octal, or hexadecimal. The bit range of each branch image is preset. Every pixel in the image to be processed is split in this manner, and each branch image is obtained from the split values of all pixels.
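The conversion steps above can be sketched for an arbitrary base N as follows. This is an illustrative sketch only: the helper names, the fixed digit count, and the convention that a range is a (start, stop) slice of the N-ary digit string counted from the most significant digit are assumptions, and the pixel value is treated as an unsigned integer.

```python
DIGITS = "0123456789ABCDEF"


def to_base(value: int, base: int, length: int) -> str:
    """Write `value` as a fixed-length digit string in the given base (2 to 16)."""
    digits = []
    for _ in range(length):
        value, remainder = divmod(value, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


def split_by_digits(value: int, base: int, length: int, digit_ranges):
    """Extract consecutive N-ary digits per branch and convert each back to decimal."""
    text = to_base(value, base, length)
    return [int(text[start:stop], base) for start, stop in digit_ranges]


# Binary: 32 digits, split as 8 + 16 + 8 digits.
print(split_by_digits(123456789, 2, 32, [(0, 8), (8, 24), (24, 32)]))
# Hexadecimal: 8 digits, split as 2 + 4 + 2 digits (the same bit boundaries).
print(split_by_digits(123456789, 16, 8, [(0, 2), (2, 6), (6, 8)]))
```

Both calls yield the same branch values because the hexadecimal digit boundaries coincide with the binary bit boundaries; a base whose digits do not align with the desired bit ranges would produce a different split.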
[0078] Preferably, the union of the bit ranges of the multiple branch images is the total number of bits of the image to be processed. In other words, the multiple branch images need to cover all the bits of the image to be processed, so that all the characteristics of the image to be processed can be transferred to the branch images, thereby avoiding the loss of information during the splitting of the image to be processed.
[0079] Preferably, the sum of the widths of the bit ranges of the multiple branch images is greater than or equal to the bit width of the image to be processed; that is, the bit ranges extracted for the multiple branch images can adjoin each other or overlap.
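The two preferred conditions on the bit ranges can be checked mechanically. The sketch below assumes that each bit range is written as a (low bit index, width) pair with the least significant bit at index 0; this convention and the function name are illustrative and not part of the disclosure.

```python
def check_bit_ranges(ranges, total_bits):
    """Return (covers_all_bits, width_sum_is_sufficient) for a list of bit ranges."""
    covered = 0
    for low, width in ranges:
        covered |= ((1 << width) - 1) << low   # mark the bits this branch extracts
    full = (1 << total_bits) - 1
    covers_all = (covered & full) == full
    width_sum = sum(width for _, width in ranges)
    return covers_all, width_sum >= total_bits


print(check_bit_ranges([(24, 8), (8, 16), (0, 8)], 32))     # adjoining 8 + 16 + 8 split
print(check_bit_ranges([(16, 16), (8, 16), (0, 16)], 32))   # overlapping 16-bit ranges
print(check_bit_ranges([(24, 8), (0, 8)], 32))              # middle 16 bits are missing
```

Full coverage implies the width condition, so the second flag only adds information when coverage fails.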
[0080] Taking the splitting process shown in Figure 2 as an example, the first floating-point number of a certain pixel in the image to be processed is 123456789 (32 bits). Converting it to binary gives a 32-bit binary number. The first (highest) 8 bits of this binary number are extracted as the second floating-point number of the pixel in the first branch image, which is 7 in decimal. The middle 16 bits are extracted as the second floating-point number of the pixel in the second branch image, which is 23501 in decimal. Finally, the last (lowest) 8 bits are extracted as the second floating-point number of the pixel in the third branch image, which is 21 in decimal.
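The arithmetic of this example can be reproduced with shifts and masks. The sketch treats the pixel value as an unsigned 32-bit integer and writes each bit range as a (low bit index, width) pair with the least significant bit at index 0; both choices, and the function name, are assumptions made for illustration.

```python
def extract_bits(value: int, low: int, width: int) -> int:
    """Return the `width` consecutive bits of `value` starting at bit `low` (LSB = 0)."""
    return (value >> low) & ((1 << width) - 1)


pixel = 123456789                       # first floating-point number from the example
ranges = [(24, 8), (8, 16), (0, 8)]     # high 8 bits, middle 16 bits, low 8 bits

branches = [extract_bits(pixel, low, width) for low, width in ranges]
print(format(pixel, "032b"))            # 00000111010110111100110100010101
print(branches)                         # [7, 23501, 21]
```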
[0081] It should be noted that the 32-bit floating-point number of the pixel in Figure 2 can also be split in other ways. For example, the first 16 bits of its corresponding 32-bit binary number can be used as the second floating-point number of that pixel in the first branch image, and the last 16 bits can be used as the second floating-point number of that pixel in the second branch image. Another example is using all 32 bits of its corresponding 32-bit binary number as the second floating-point number of that pixel in the first branch image, and the last 16 bits as the second floating-point number of that pixel in the second branch image. Yet another example is using the first 16 bits of its corresponding 32-bit binary number as the second floating-point number of that pixel in the first branch image, bits 9 to 24 as the second floating-point number of that pixel in the second branch image, and the last 16 bits as the second floating-point number of that pixel in the third branch image.
[0082] In step S103, the multiple branch images are respectively input to the corresponding image processing models, and each image processing model outputs the processing result of the corresponding branch image.
[0083] The image processing model can be pre-trained, and the image processing model is matched with multiple consecutive bits of the corresponding image to be processed. In other words, the processing content or processing task of the image processing model needs to match the characteristics of the branch image.
[0084] The lower bits of a pixel's floating-point number mostly store high-frequency information, such as noise and texture, and are the main target of image enhancement. The middle and higher bits store low-frequency information, such as grayscale and contours, and remain largely unchanged during image enhancement. Based on these characteristics, specific tasks can be assigned to the processing models of the different branch images. The image processing models for the middle-bit and high-bit branches perform less computation, so as to preserve the original input information and stabilize the output image. The image processing model for the low-bit branch performs more computation, focusing on high-intensity denoising and deblurring; because the middle-bit and high-bit branches preserve the original information, the low-bit branch does not need to worry about information loss. The middle-bit and high-bit branches have low computational cost, are simple to prune, are easy to learn, and incur almost no quantization error. Most of the computation is concentrated in the low-bit branch, so the overall structure is fast without sacrificing accuracy.
[0085] Taking the splitting results shown in Figure 2 as an example, the image processing models corresponding to the first and second branch images, namely the branch images in which the second floating-point numbers of the pixel are 7 and 23501, perform less computation. Their purpose is to preserve the original input information and stabilize the output image. The image processing model corresponding to the third branch image, namely the branch image in which the second floating-point number of the pixel is 21, performs more computation. Its function is high-intensity denoising, deblurring, etc.
[0086] In step S104, the processing results of the multiple branch images are fused to obtain the processing result of the image to be processed.
[0087] The fusion process can be performed by reversing the splitting process in step S102, or by using a pre-trained fusion network.
[0088] Taking the reverse of the splitting process as an example, the fusion can be performed as follows: First, the third floating-point number of the pixel of the processing result of each branch image is converted from decimal to N-ary; next, according to the bit range of each branch image, the third floating-point number of the pixel of the processing result of each branch image is added to the corresponding bits of the fourth floating-point number; finally, the fourth floating-point number is converted from N-ary to decimal, and the floating-point number of the pixel of the image to be processed is updated to the fourth floating-point number to obtain the processing result of the image to be processed. Here, N-ary denotes a non-decimal base, such as binary, octal, or hexadecimal. The bit range of each branch image is the same as its bit range in the splitting process. Every pixel in the branch images is fused in this manner.
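For non-overlapping bit ranges, the reverse process can be sketched as placing each branch value back at its bit position. How overlapping ranges would be combined is not specified in this description, so the sketch deliberately covers only the non-overlapping case. The (low bit index, width) convention, the masking of each value to its branch width, and the function name are assumptions.

```python
def fuse_branches(values, ranges):
    """Rebuild one pixel value from per-branch values (non-overlapping ranges only)."""
    fused = 0                                   # plays the role of the fourth floating-point number
    for value, (low, width) in zip(values, ranges):
        mask = (1 << width) - 1
        fused |= (value & mask) << low          # write the branch into its bit range
    return fused


ranges = [(24, 8), (8, 16), (0, 8)]             # high 8 bits, middle 16 bits, low 8 bits
print(fuse_branches([7, 23501, 21], ranges))    # 123456789, the original pixel
print(fuse_branches([7, 23501, 20], ranges))    # 123456788, low branch changed by a model
```

Because the ranges do not overlap, writing with a bitwise OR gives the same result as adding the shifted values.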
[0089] This disclosure acquires an image to be processed, splits it into multiple branch images, inputs each branch image into a corresponding image processing model, and each image processing model outputs a processing result for its corresponding branch image. Finally, the processing results of the multiple branch images are fused to obtain the final processing result for the image to be processed. Since the image information of each branch image includes image information represented by multiple consecutive bits from the image information of the image to be processed, each branch image can carry some characteristics of the image to be processed. Because each branch image is processed by a different image processing model, the processing content and intensity of each image processing model can be adapted to the image characteristics carried by its corresponding branch image, thereby ensuring the accuracy of image processing. Furthermore, compared to related technologies that perform all processing on the image to be processed, each image processing model in this application targets the smaller number of bits in its branch image and processes only a portion of the content, thereby reducing the power consumption of image processing and improving its efficiency. In other words, the image processing method provided in this application can perform image processing efficiently and with high precision at low power consumption. If applied to the photography scenario of terminal devices, it can eliminate image degradation phenomena such as noise, blur, and low resolution, and improve image quality, thus accelerating image enhancement processing.
[0090] Furthermore, the processing approach of splitting, processing, and fusing the image to be processed provided in this disclosure is applicable to any model for any image enhancement task. Moreover, it can adjust the distribution ratio between branches and the number of splits according to different tasks such as denoising, deblurring, and moiré removal, thereby adapting to different task objectives and making the solution more robust and generalizable. It can accelerate the image enhancement process during the terminal device's photo capture process, fully utilizing the advantages of deep learning technology to overcome the disadvantages of high power consumption and slow speed in image enhancement, eliminating noise, blurring, and other degradation phenomena in the terminal device's display, and improving image quality.
[0091] Figure 3 illustrates the complete process of image enhancement using the image processing method provided in this application. First, the 32-bit input image is split into three branch images: a high 8-bit branch image, a middle 16-bit branch image, and a low 8-bit branch image. Each branch image forms an image processing branch, namely, a high 8-bit branch, a middle 16-bit branch, and a low 8-bit branch. In the high 8-bit branch, the high 8-bit branch image is input into an 8-bit small model whose task is to preserve information and stabilize the output. In the middle 16-bit branch, the middle 16-bit branch image is input into a 16-bit medium model whose task is to preserve information while also performing enhancement. In the low 8-bit branch, the low 8-bit branch image is input into an 8-bit large model that performs high-intensity image enhancement. Finally, the processing results output by the models of the three branches are fused to form a 32-bit output image. The output image is enhanced relative to the input image, and the process is efficient and accurate.
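The split, process, and fuse flow of this example can be sketched end to end. The three "models" below are hypothetical stand-ins, not the trained networks of this disclosure: the high-bit and middle-bit branches use an identity function, and the low-bit branch clears its two lowest bits as a crude placeholder for enhancement. The (low bit index, width) range convention is likewise an assumption.

```python
from typing import Callable, List, Sequence, Tuple

BitRange = Tuple[int, int]                     # (low bit index, width), LSB = 0
Model = Callable[[List[int]], List[int]]       # maps a branch image to its result


def split_image(pixels: Sequence[int], ranges: Sequence[BitRange]) -> List[List[int]]:
    """Produce one branch image (a list of pixel values) per bit range."""
    return [[(p >> low) & ((1 << width) - 1) for p in pixels] for low, width in ranges]


def fuse_image(branches: Sequence[Sequence[int]], ranges: Sequence[BitRange]) -> List[int]:
    """Reverse of split_image for non-overlapping ranges."""
    fused = [0] * len(branches[0])
    for branch, (low, width) in zip(branches, ranges):
        mask = (1 << width) - 1
        for i, value in enumerate(branch):
            fused[i] |= (value & mask) << low
    return fused


def enhance(pixels: Sequence[int], ranges: Sequence[BitRange], models: Sequence[Model]) -> List[int]:
    """Split the image, run each branch through its own model, then fuse the results."""
    branches = split_image(pixels, ranges)
    results = [model(branch) for model, branch in zip(models, branches)]
    return fuse_image(results, ranges)


def identity(branch: List[int]) -> List[int]:
    """Placeholder for the small and medium models: preserve the input."""
    return list(branch)


def clear_two_low_bits(branch: List[int]) -> List[int]:
    """Placeholder for the large low-bit model; NOT a real denoiser."""
    return [value & ~0b11 for value in branch]


ranges = [(24, 8), (8, 16), (0, 8)]            # high 8, middle 16, low 8 bits
image = [123456789, 123456790]                 # a two-pixel 32-bit "image"
print(enhance(image, ranges, [identity, identity, clear_two_low_bits]))
```

In this toy run the two pixels differ only in their low bits, so only the low-bit branch changes them; the high and middle branches pass through unchanged.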
[0092] The acceleration method proposed in this embodiment uses a bit splitting strategy to divide high-bit information into multiple branches, assigning each branch different tasks and characteristics. This allows for targeted high-intensity enhancement of noise, details, and textures, while preserving grayscale and contours, thereby reducing quantization errors. Furthermore, by splitting and cropping high-bit data into low-bit data, compared to direct scaling, it ensures that the numerical range remains within the acceptable range of the corresponding branch, minimizing quantization errors.
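The difference between splitting and direct scaling can be illustrated numerically. In the sketch below, "direct scaling" is assumed to mean reducing a 32-bit value to 8 bits by discarding its 24 low bits; the source does not define the scaling operation, so this is only one plausible reading, and all names are hypothetical.

```python
pixel = 123456789                               # 32-bit example value


def direct_scale_to_8bit(value: int, total_bits: int = 32) -> int:
    """Assumed direct scaling: keep only the top 8 bits."""
    return value >> (total_bits - 8)


def restore_from_8bit(code: int, total_bits: int = 32) -> int:
    """Scale the 8-bit code back to the original range."""
    return code << (total_bits - 8)


# Direct scaling: everything below the top 8 bits is lost.
scaled_error = pixel - restore_from_8bit(direct_scale_to_8bit(pixel))

# Bit splitting: every branch value fits its own low-bit range and nothing is lost.
ranges = [(24, 8), (8, 16), (0, 8)]             # (low bit index, width), LSB = 0
parts = [(pixel >> low) & ((1 << width) - 1) for low, width in ranges]
split_error = pixel - sum(part << low for part, (low, _) in zip(parts, ranges))

print(scaled_error, split_error)
```

Each branch value stays within the range of its own width, which is the property the paragraph above relies on to keep quantization error small.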
[0093] In some embodiments of this disclosure, the training process of the image processing model includes the following steps: First, sample images and corresponding label images are obtained from the training set; next, the sample images are split into multiple sample branch images, and the label images are split into multiple label branch images, wherein each sample branch image includes multiple consecutive bits of the sample image, and each label branch image includes multiple consecutive bits of the label image; then, the multiple sample branch images are respectively input to the corresponding image processing model, and each image processing model outputs the processing result of the corresponding sample branch image; a first network loss value is determined based on the processing result of the sample branch image and the corresponding label branch image, and the network parameters of the corresponding image processing model are adjusted based on the first network loss value.
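The per-branch training step can be sketched with a deliberately tiny stand-in for a network. Everything model-related here is hypothetical: each branch "model" is a single scalar gain w applied to normalized branch values, the first loss is taken to be mean squared error, and the sample and label pixels are made-up values in which the label's low 8 bits are half of the sample's. The (low bit index, width) range convention is also an assumption.

```python
from typing import List, Sequence, Tuple

BitRange = Tuple[int, int]                      # (low bit index, width), LSB = 0


def split_normalized(pixels: Sequence[int], ranges: Sequence[BitRange]) -> List[List[float]]:
    """Split pixels into branches and scale each branch to [0, 1]."""
    return [
        [((p >> low) & ((1 << width) - 1)) / ((1 << width) - 1) for p in pixels]
        for low, width in ranges
    ]


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """First network loss value, assumed here to be mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def train_branch(x: Sequence[float], y: Sequence[float], lr: float = 0.5, steps: int = 200) -> float:
    """Fit the toy one-parameter model out = w * x by gradient descent on the MSE."""
    w = 0.0
    for _ in range(steps):
        grad = sum(2.0 * (w * xi - yi) * xi for xi, yi in zip(x, y)) / len(x)
        w -= lr * grad
    return w


ranges = [(24, 8), (8, 16), (0, 8)]
sample = [0xC8404050, 0xD2808064, 0xF0C0C078]   # made-up sample pixels
label = [0xC8404028, 0xD2808032, 0xF0C0C03C]    # same high/middle bits, low bits halved

sample_branches = split_normalized(sample, ranges)
label_branches = split_normalized(label, ranges)   # labels are split the same way

weights = [train_branch(x, y) for x, y in zip(sample_branches, label_branches)]
first_losses = [
    mse([w * xi for xi in x], y)
    for w, x, y in zip(weights, sample_branches, label_branches)
]
print(weights, first_losses)
```

Each branch is trained only against its own label branch, so the high-bit and middle-bit models learn to pass their input through while the low-bit model learns the change.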
[0094] Here, the sample image and the label image are images of the same scene but of different quality; the sample image has lower quality, while the label image has higher quality. The label image is the target image for processing. The process of splitting the sample image can be the same as or different from the process of splitting the image to be processed described in step S102. The label branch image is generated in the same way as the sample branch image to ensure information equivalence. This allows the model to correctly recognize its task objective, ensuring that the final output meets the requirements.
[0095] When training the image processing models for different branches, different hyperparameters can be set, such as different learning rates and different numbers of iterations.
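The branch-wise training described above can be sketched as follows. This is a toy example under stated assumptions: each branch "model" is a single weight w with output w * x, the first network loss is a mean squared error between the branch output and the label branch value, and the pixel values, learning rates and iteration counts in HPARAMS are hypothetical.

```python
# Assumed branch layout: name -> (lowest bit position, width in bits).
BRANCHES = {"high8": (24, 8), "mid16": (8, 16), "low8": (0, 8)}

# Hypothetical per-branch hyperparameters (learning rate, iterations).
HPARAMS = {
    "high8": {"lr": 0.5, "steps": 10},
    "mid16": {"lr": 0.5, "steps": 20},
    "low8": {"lr": 0.25, "steps": 40},
}


def split_normalized(pixel):
    """Split one pixel code and normalize each branch value to [0, 1]."""
    out = {}
    for name, (lo, width) in BRANCHES.items():
        mask = (1 << width) - 1
        out[name] = ((pixel >> lo) & mask) / mask
    return out


def mse(w, xs, ys):
    """First network loss of the toy model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_branch(xs, ys, lr, steps):
    """Plain gradient descent on the branch loss, starting from w = 0."""
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


# Toy data: label pixels differ from sample pixels only in the low bits.
samples = [0x10203040, 0x50607080, 0x90A0B0C0]
labels = [0x10203048, 0x50607088, 0x90A0B0C8]

report = {}
for name, hp in HPARAMS.items():
    # Sample and label images are split in the same way.
    xs = [split_normalized(p)[name] for p in samples]
    ys = [split_normalized(p)[name] for p in labels]
    w = train_branch(xs, ys, **hp)
    report[name] = (mse(0.0, xs, ys), mse(w, xs, ys))  # (initial, final)
```

Each entry of report holds the loss of one branch before and after its independent training run.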
[0096] In addition, after training the image processing model for each branch is completed, the processing results of multiple sample branch images can be fused to obtain the processing result of the sample image; then, a second network loss value is determined based on the processing result of the sample image and the corresponding label image, and the network parameters of each image processing model are adjusted based on the second network loss value.
[0097] The fusion process of the sample branch image processing results can be the same as the fusion process of the branch image processing results described in step S104, or it can be different from the fusion process of the branch image processing results described in step S104.
[0098] This embodiment provides a multi-precision, multi-stage model training method. The multi-stage approach involves first training each branch independently, so that each branch adapts to its own task and converges, and then merging the branches and training them jointly. This ensures model convergence while correctly guiding each branch to handle its specific task. Furthermore, each branch can apply hyperparameters more suitable for its task during individual training, so that the model converges better and faster. This in turn shortens training time and allows for more frequent model iterations.
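The joint stage can likewise be sketched in a toy form. The sketch assumes one weight per branch, a fused output obtained by summing the weighted branch contributions in a normalized domain, and a mean squared error against the whole label pixel as the second network loss; the starting weights in stage_one are hypothetical values standing in for the result of the first stage.

```python
# Assumed branch layout: name -> (lowest bit position, width in bits).
BRANCHES = {"high8": (24, 8), "mid16": (8, 16), "low8": (0, 8)}
TOTAL = float(1 << 32)


def features(pixel):
    """Contribution of each branch to the normalized fused value."""
    return {
        name: (((pixel >> lo) & ((1 << width) - 1)) << lo) / TOTAL
        for name, (lo, width) in BRANCHES.items()
    }


def fused_output(weights, feats):
    """Toy fusion: weighted sum of the branch contributions."""
    return sum(weights[name] * feats[name] for name in BRANCHES)


def joint_loss(weights, samples, labels):
    """Second network loss, computed on the fused result."""
    errors = [
        (fused_output(weights, features(s)) - l / TOTAL) ** 2
        for s, l in zip(samples, labels)
    ]
    return sum(errors) / len(errors)


def joint_finetune(weights, samples, labels, lr=0.5, steps=50):
    """Adjust the weights of all branches from the same fused loss."""
    weights = dict(weights)
    for _ in range(steps):
        grads = {name: 0.0 for name in BRANCHES}
        for s, l in zip(samples, labels):
            feats = features(s)
            err = fused_output(weights, feats) - l / TOTAL
            for name in BRANCHES:
                grads[name] += 2 * err * feats[name] / len(samples)
        for name in BRANCHES:
            weights[name] -= lr * grads[name]
    return weights


stage_one = {"high8": 0.95, "mid16": 0.90, "low8": 0.80}  # hypothetical
samples = [0x10203040, 0x50607080, 0x90A0B0C0]
labels = [0x10203048, 0x50607088, 0x90A0B0C8]

before = joint_loss(stage_one, samples, labels)
tuned = joint_finetune(stage_one, samples, labels)
after = joint_loss(tuned, samples, labels)
```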
[0099] Multi-precision quantization involves splitting the label image into multiple label branch images, each serving as the label for the image processing model of its respective branch. This allows the model to correctly recognize its task objective, ensuring that the final output meets the requirements. Compared to traditional single-precision quantization, this approach uses a parallel mixed-precision structure that assigns different tasks, data, and training parameters to different branches. This makes each model more targeted, preserves information, and reduces quantization errors, thereby improving final image quality and enhancing the user experience.
[0100] In some embodiments of this disclosure, after the image processing model has been trained, the corresponding image processing model can be quantized based on each sample branch image.
[0101] Because the parallel mixed-precision architecture splits the input, the data used for actual static quantization also needs to be split, and each branch needs to be quantized independently. Transforming high-bit data into low-bit data by splitting and cropping, rather than by direct scaling, ensures that the numerical range stays within the acceptable range of the corresponding branch and minimizes quantization error. After the image processing model of each branch is quantized individually, the quantized models are combined and stitched together to form the final model.
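A minimal sketch of per-branch calibration follows. It assumes unsigned 32-bit integer pixel codes, a simple min/max observation of the calibration data, and a quantization step computed as the observed range divided by the number of available codes; the function names and calibration values are hypothetical, and no particular quantization toolkit is implied.

```python
# Assumed branch layout: name -> (lowest bit position, width in bits).
BRANCHES = {"high8": (24, 8), "mid16": (8, 16), "low8": (0, 8)}


def branch_calibration(calib_pixels):
    """Split the calibration data and observe the range of each branch."""
    stats = {}
    for name, (lo, width) in BRANCHES.items():
        mask = (1 << width) - 1
        values = [(p >> lo) & mask for p in calib_pixels]
        observed = max(values) - min(values)
        # Step size when the branch is quantized to `width` bits.
        stats[name] = {
            "min": min(values),
            "max": max(values),
            "step": max(observed, 1) / mask,
        }
    return stats


def unsplit_step(calib_pixels, bits=8):
    """Step size if whole 32-bit values were scaled directly to `bits` bits."""
    observed = max(calib_pixels) - min(calib_pixels)
    return max(observed, 1) / ((1 << bits) - 1)


calib = [0x10203040, 0x50607080, 0x90A0B0C0, 0xFFFFFFFF, 0x00000000]
stats = branch_calibration(calib)
baseline = unsplit_step(calib)
```

Under these assumptions every branch value already fits the code range of its own branch, so the per-branch step never exceeds one code, while the step of the unsplit baseline spans many codes of the original data.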
[0102] According to a second aspect of the embodiments of this disclosure, an image processing apparatus is provided. Referring to Figure 4, the apparatus includes:
[0103] The acquisition module 401 is used to acquire the image to be processed;
[0104] The splitting module 402 is used to split the image to be processed into multiple branch images, wherein the image information of the branch images includes image information represented by multiple consecutive bits in the image information of the image to be processed;
[0105] The processing module 403 is used to input the multiple branch images into the corresponding image processing models respectively, and each image processing model outputs the processing result of the corresponding branch image;
[0106] The fusion module 404 is used to fuse the processing results of multiple branch images to obtain the processing result of the image to be processed.
[0107] In some embodiments of this disclosure, the splitting module is used for:
[0108] Convert the first floating-point number of the pixels in the image to be processed from decimal to N-ary;
[0109] Extract multiple consecutive bits from the first floating-point number according to the bit range of each branch image, and convert the N-ary second floating-point number composed of the consecutive bits into decimal;
[0110] Update the floating-point number of the pixels in the image to be processed to the second floating-point number corresponding to the branch image to obtain the branch image.
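The three splitting steps above can be sketched for a general base N. The sketch treats the pixel value as a non-negative integer code, which is an assumption, since the disclosure speaks of floating-point numbers; digit positions are counted from the least significant digit, and the function names are hypothetical.

```python
def to_digits(value, base, n_digits):
    """Convert a decimal value to N-ary digits, least significant first."""
    digits = []
    for _ in range(n_digits):
        value, digit = divmod(value, base)
        digits.append(digit)
    return digits


def from_digits(digits, base):
    """Convert N-ary digits (least significant first) back to decimal."""
    return sum(digit * base**i for i, digit in enumerate(digits))


def extract_branch(value, base, n_digits, lo, width):
    """Take `width` consecutive digits starting at position `lo`."""
    digits = to_digits(value, base, n_digits)
    return from_digits(digits[lo:lo + width], base)


pixel = 0xA1B2C3D4
high8 = extract_branch(pixel, base=2, n_digits=32, lo=24, width=8)
mid16_hex = extract_branch(pixel, base=16, n_digits=8, lo=2, width=4)
```

With N = 2 the digit range is a bit range; with N = 16 the same middle 16 bits are selected as four hexadecimal digits.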
[0111] In some embodiments of this disclosure, the union of the bit ranges of the plurality of branch images is the total number of bits of the image to be processed, and the sum of the widths of the bit ranges of the plurality of branch images is greater than or equal to the bit width of the image to be processed.
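The two conditions on the bit ranges can be checked with a short sketch. It assumes each range is given as a hypothetical (lowest bit position, width) pair and that ranges are allowed to overlap.

```python
def check_bit_ranges(ranges, total_bits):
    """Return (union covers all bits, sum of widths >= total bit width)."""
    covered = set()
    for lo, width in ranges:
        covered.update(range(lo, lo + width))
    union_ok = covered == set(range(total_bits))
    width_ok = sum(width for _, width in ranges) >= total_bits
    return union_ok, width_ok


exact = check_bit_ranges([(0, 8), (8, 16), (24, 8)], 32)
overlapping = check_bit_ranges([(0, 12), (8, 16), (24, 8)], 32)
gapped = check_bit_ranges([(0, 8), (24, 8)], 32)
```

The overlapping layout shows why the sum of the widths may exceed the bit width of the image while the union still covers every bit.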
[0112] In some embodiments of this disclosure, the image processing model is matched with a plurality of consecutive bits of the corresponding image to be processed.
[0113] In some embodiments of this disclosure, the fusion module is used for:
[0114] The third floating-point number of the pixel of the processing result of each branch image is converted from decimal to N-ary;
[0115] According to the bit range of each branch image, the third floating-point number of the pixel of the processing result of each branch image is added to the corresponding bit of the fourth floating-point number;
[0116] The fourth floating-point number is converted from N-ary to decimal, and the floating-point number of the pixels of the image to be processed is updated to the fourth floating-point number to obtain the processing result of the image to be processed.
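The fusion steps above can be sketched as follows, again treating pixel values as non-negative integer codes. The fourth number starts at zero and each branch result is added at the digit positions of its range; clamping a branch result to its own range before adding is an assumption of this sketch, introduced so that an out-of-range model output cannot spill into a neighbouring branch.

```python
# Assumed branch layout: name -> (lowest digit position, width in digits).
RANGES = {"high8": (24, 8), "mid16": (8, 16), "low8": (0, 8)}


def fuse(results, ranges=RANGES, base=2):
    """Add each branch result into its digit range of the fused value."""
    fused = 0  # the "fourth" number, initially zero
    for name, (lo, width) in ranges.items():
        limit = base**width - 1
        # Assumption: round and clamp the model output to the branch range.
        value = min(max(int(round(results[name])), 0), limit)
        fused += value * base**lo
    return fused


exact = fuse({"high8": 0xA1, "mid16": 0xB2C3, "low8": 0xD4})
clamped = fuse({"high8": 0xA1, "mid16": 0xB2C3, "low8": 300})
```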
[0117] In some embodiments of this disclosure, a first training module is included, for:
[0118] Obtain sample images and corresponding label images from the training set;
[0119] The sample image is split into multiple sample branch images, and the label image is split into multiple label branch images, wherein each sample branch image includes multiple consecutive bits of the sample image, and each label branch image includes multiple consecutive bits of the label image;
[0120] The sample branch images are respectively input into the corresponding image processing model, and each image processing model outputs the processing result of the corresponding sample branch image;
[0121] A first network loss value is determined based on the processing result of the sample branch image and the corresponding label branch image, and the network parameters of the corresponding image processing model are adjusted based on the first network loss value.
[0122] In some embodiments of this disclosure, a second training module is included, for:
[0123] The processing results of multiple sample branch images are fused together to obtain the processing result of the sample image;
[0124] A second network loss value is determined based on the processing result of the sample image and the corresponding label image, and the network parameters of each image processing model are adjusted based on the second network loss value.
[0125] In some embodiments of this disclosure, a quantization module is included, for:
[0126] The corresponding image processing model is quantized based on each sample branch image.
[0127] Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operation has been described in detail in the embodiments of the method in the first aspect, and will not be elaborated upon here.
[0128] According to a third aspect of the embodiments of this disclosure, an electronic device is provided. Figure 5 shows an exemplary block diagram of the electronic device. For instance, device 500 could be a mobile phone, computer, digital broadcasting terminal, messaging device, game console, tablet device, medical device, fitness equipment, personal digital assistant, etc.
[0129] Referring to Figure 5, the device 500 may include one or more of the following components: a processing component 502, a memory 504, a power supply component 506, a multimedia component 508, an audio component 510, an input/output (I/O) interface 512, a sensor component 514, and a communication component 516.
[0130] Processing component 502 typically controls the overall operation of device 500, such as operations associated with display, telephone calls, data communication, camera operation, and recording. Processing component 502 may include one or more processors 520 to execute instructions to perform all or part of the steps of the methods described above. Furthermore, processing component 502 may include one or more modules to facilitate interaction between processing component 502 and other components. For example, processing component 502 may include a multimedia module to facilitate interaction between multimedia component 508 and processing component 502.
[0131] Memory 504 is configured to store various types of data to support the operation of device 500. Examples of this data include instructions for any application or method operating on device 500, contact data, phonebook data, messages, pictures, videos, etc. Memory 504 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk.
[0132] The power supply component 506 provides power to the various components of the device 500. The power supply component 506 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power to the device 500.
[0133] Multimedia component 508 includes a screen that provides an output interface between the device 500 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touchscreen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touch, swipe, and gestures on the touch panel. The touch sensors may not only sense the boundaries of the touch or swipe action but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, multimedia component 508 includes a front-facing camera and/or a rear-facing camera. When the device 500 is in an operating mode, such as a shooting mode or a video mode, the front-facing camera and/or the rear-facing camera may receive external multimedia data. Each front-facing camera and rear-facing camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
[0134] Audio component 510 is configured to output and/or input audio signals. For example, audio component 510 includes a microphone (MIC) configured to receive external audio signals when device 500 is in an operating mode, such as call mode, recording mode, and voice recognition mode. The received audio signals may be further stored in memory 504 or transmitted via communication component 516. In some embodiments, audio component 510 includes a speaker for outputting audio signals.
[0135] I/O interface 512 provides an interface between processing component 502 and peripheral interface modules, such as keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to, home buttons, volume buttons, power buttons, and lock buttons.
[0136] Sensor assembly 514 includes one or more sensors for providing state assessments of various aspects of device 500. For example, sensor assembly 514 may detect the on/off state of device 500, the relative positioning of components such as the display and keypad of device 500, changes in the position of device 500 or a component of device 500, the presence or absence of user contact with device 500, the orientation or acceleration/deceleration of device 500, and temperature changes of device 500. Sensor assembly 514 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. Sensor assembly 514 may include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, sensor assembly 514 may include an accelerometer, a gyroscope, a magnetometer, a pressure sensor, or a temperature sensor.
[0137] Communication component 516 is configured to facilitate wired or wireless communication between device 500 and other devices. Device 500 can access wireless networks based on communication standards, such as WiFi, 2G or 3G, 4G or 5G, or combinations thereof. In one exemplary embodiment, communication component 516 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, communication component 516 includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
[0138] In an exemplary embodiment, the device 500 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to perform the image processing method described above.
[0139] According to a fourth aspect, in an exemplary embodiment, this disclosure provides a non-transitory computer-readable storage medium including instructions, such as a memory 504 including instructions, which can be executed by a processor 520 of device 500 to carry out the image processing method described above. For example, the non-transitory computer-readable storage medium may be a ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
[0140] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are indicated by the following claims.
[0141] It should be understood that this disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this disclosure is limited only by the appended claims.
Claims
1. An image processing method, characterized in that it comprises: obtaining an image to be processed; splitting the image to be processed into multiple branch images, wherein the image information of each branch image includes image information represented by a series of consecutive bits in the image information of the image to be processed, and wherein splitting the image to be processed into multiple branch images includes: converting a first floating-point number of the pixels of the image to be processed from decimal to N-ary; extracting a series of consecutive bits from the first floating-point number according to the bit range of each branch image, and converting the N-ary second floating-point number composed of the series of consecutive bits into decimal; and updating the floating-point number of the pixels of the image to be processed to the second floating-point number corresponding to the branch image to obtain the branch image; inputting the multiple branch images into the corresponding image processing models respectively, each image processing model outputting the processing result of the corresponding branch image; and fusing the processing results of the multiple branch images to obtain the processing result of the image to be processed, wherein the fusion is performed based on the bit range corresponding to each branch image.
2. The image processing method according to claim 1, characterized in that the union of the bit ranges of the multiple branch images is the total number of bits of the image to be processed; and/or the sum of the widths of the bit ranges of the multiple branch images is greater than or equal to the bit width of the image to be processed.
3. The image processing method according to claim 1, characterized in that the image processing model is matched with multiple consecutive bits of the corresponding image to be processed.
4. The image processing method according to claim 1, characterized in that the step of fusing the processing results of multiple branch images to obtain the processing result of the image to be processed includes: converting a third floating-point number of the pixel of the processing result of each branch image from decimal to N-ary; adding, according to the bit range of each branch image, the third floating-point number of the pixel of the processing result of each branch image to the corresponding bits of a fourth floating-point number; and converting the fourth floating-point number from N-ary to decimal, and updating the floating-point number of the pixels of the image to be processed to the fourth floating-point number to obtain the processing result of the image to be processed.
5. The image processing method according to claim 1, characterized in that it further comprises: obtaining sample images and corresponding label images from a training set; splitting the sample image into multiple sample branch images, and splitting the label image into multiple label branch images, wherein each sample branch image includes multiple consecutive bits of the sample image, and each label branch image includes multiple consecutive bits of the label image; inputting the sample branch images into the corresponding image processing models respectively, each image processing model outputting the processing result of the corresponding sample branch image; and determining a first network loss value based on the processing result of the sample branch image and the corresponding label branch image, and adjusting the network parameters of the corresponding image processing model based on the first network loss value.
6. The image processing method according to claim 5, characterized in that it further comprises: fusing the processing results of the multiple sample branch images to obtain the processing result of the sample image; and determining a second network loss value based on the processing result of the sample image and the corresponding label image, and adjusting the network parameters of each image processing model based on the second network loss value.
7. The image processing method according to claim 5 or 6, characterized in that it further comprises: quantizing the corresponding image processing model based on each sample branch image.
8. An image processing apparatus, characterized in that it comprises: an acquisition module, used to acquire an image to be processed; a splitting module, used to split the image to be processed into multiple branch images, wherein the image information of each branch image includes image information represented by a series of consecutive bits in the image information of the image to be processed, and wherein the splitting module is used to: convert a first floating-point number of the pixels of the image to be processed from decimal to N-ary; extract a series of consecutive bits from the first floating-point number according to the bit range of each branch image, and convert the N-ary second floating-point number composed of the series of consecutive bits into decimal; and update the floating-point number of the pixels of the image to be processed to the second floating-point number corresponding to the branch image, thereby obtaining the branch image; a processing module, used to input the multiple branch images into the corresponding image processing models respectively, each image processing model outputting the processing result of the corresponding branch image; and a fusion module, used to fuse the processing results of the multiple branch images to obtain the processing result of the image to be processed, wherein the fusion is performed based on the bit range corresponding to each branch image.
9. The image processing apparatus according to claim 8, characterized in that the union of the bit ranges of the multiple branch images is the total number of bits of the image to be processed; and/or the sum of the widths of the bit ranges of the multiple branch images is greater than or equal to the bit width of the image to be processed.
10. The image processing apparatus according to claim 8, characterized in that the image processing model is matched with multiple consecutive bits of the corresponding image to be processed.
11. The image processing apparatus according to claim 8, characterized in that the fusion module is used to: convert a third floating-point number of the pixel of the processing result of each branch image from decimal to N-ary; add, according to the bit range of each branch image, the third floating-point number of the pixel of the processing result of each branch image to the corresponding bits of a fourth floating-point number; and convert the fourth floating-point number from N-ary to decimal, and update the floating-point number of the pixels of the image to be processed to the fourth floating-point number to obtain the processing result of the image to be processed.
12. The image processing apparatus according to claim 8, characterized in that it further comprises a first training module, used to: obtain sample images and corresponding label images from a training set; split the sample image into multiple sample branch images, and split the label image into multiple label branch images, wherein each sample branch image includes multiple consecutive bits of the sample image, and each label branch image includes multiple consecutive bits of the label image; input the sample branch images into the corresponding image processing models respectively, each image processing model outputting the processing result of the corresponding sample branch image; and determine a first network loss value based on the processing result of the sample branch image and the corresponding label branch image, and adjust the network parameters of the corresponding image processing model based on the first network loss value.
13. The image processing apparatus according to claim 12, characterized in that it further comprises a second training module, used to: fuse the processing results of the multiple sample branch images to obtain the processing result of the sample image; and determine a second network loss value based on the processing result of the sample image and the corresponding label image, and adjust the network parameters of each image processing model based on the second network loss value.
14. The image processing apparatus according to claim 12 or 13, characterized in that it further comprises a quantization module, used to: quantize the corresponding image processing model based on each sample branch image.
15. An electronic device, characterized in that the electronic device includes a memory and a processor, wherein the memory is used to store computer instructions executable on the processor, and the processor is used to implement the image processing method according to any one of claims 1 to 7 when executing the computer instructions.
16. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the method of any one of claims 1 to 7.