Image processing method and device, electronic equipment and storage medium

By using camera shake and image rendering parameter network inference, high-resolution images are generated, solving the problems of image detail loss and frame rate reduction, and achieving a balance between image quality and frame rate.

CN117408884B · Active · Publication Date: 2026-06-23 · VIVO MOBILE COMM CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: VIVO MOBILE COMM CO LTD
Filing Date: 2023-11-20
Publication Date: 2026-06-23


IPC: G06T3/4076; G06T3/4023; G06T3/4046; G06T5/50; G06N3/0464

AI Tagging

Application Domain

Image enhancement; Geometric image transformation

Technology Topics

Imaging processing; Image resolution


AI Technical Summary

Technical Problem

In video games, there is an inverse relationship between image quality and frame rate, which leads to a decrease in frame rate when maintaining high image quality. Existing super-resolution processing methods result in the loss of image details, affecting image quality.

Method used

The system uses camera shaking to render images of the virtual camera's field of view at different locations. Through upsampling and fusion processing, it uses image rendering parameters for network inference to obtain sampling and fusion parameters and generate high-resolution images.

Benefits of technology

It effectively reduces the loss of image details and improves image quality, while maintaining or increasing the game frame rate. It achieves computing power allocation through an independent image processing pipeline and external chip, avoiding the occupation of rendering pipeline resources.



Abstract

The application discloses an image processing method and device, electronic equipment and storage medium, and belongs to the technical field of image processing. The method comprises the following steps: rendering a first image and a second image, wherein the first image and the second image are obtained by rendering a field of view of a virtual camera based on the first motion information, the resolution of the first image and the resolution of the second image are both the first resolution; determining the sampling parameters and the fusion parameters of the first image and the second image; performing up-sampling processing on the first image and the second image based on the sampling parameters to obtain a third image and a fourth image, wherein the resolution of the third image and the fourth image is the second resolution, and the second resolution is greater than the first resolution; and performing fusion processing on the third image and the fourth image according to the fusion parameters to obtain a fifth image, and the resolution of the fifth image is the second resolution.


Description

Technical Field

[0001] This application belongs to the field of image processing technology, and specifically relates to an image processing method, apparatus, electronic device and storage medium.

Background Technology

[0002] In related technologies, for video games, both the rendered image quality and the final frame rate have a significant impact on the gaming experience. With the computing power of the electronic device remaining constant, image quality and frame rate are inversely proportional; that is, the higher the image quality, the lower the frame rate.

[0003] To balance frame rate and image quality, the rendering resolution can be reduced during rendering to save computing power, and the resulting low-resolution image can then undergo super-resolution processing to improve image quality. However, a low-resolution image has already lost some image details, and super-resolution processing cannot recover this lost information, so the processed image still lacks image details and its quality is degraded.

Summary of the Invention

[0004] The purpose of this application is to provide an image processing method, apparatus, electronic device, and storage medium that can solve the problem of image quality degradation caused by loss of image details.

[0005] In a first aspect, embodiments of this application provide an image processing method, the method comprising:

[0006] Render a first image and a second image, wherein the first image and the second image are rendered based on the field of view of the virtual camera when the field of view of the virtual camera moves with the first motion information, and the resolution of the first image and the resolution of the second image are both the first resolution.

[0007] Determine the sampling parameters and fusion parameters of the first and second images;

[0008] Based on the sampling parameters, the first image and the second image are upsampled to obtain the processed third image and the fourth image. The resolution of the third image and the fourth image is the second resolution, which is greater than the first resolution.

[0009] The third and fourth images are fused according to the fusion parameters to obtain the processed fifth image, the resolution of the fifth image being the second resolution.

[0010] In a second aspect, embodiments of this application provide an image processing apparatus, which includes:

[0011] The rendering module is used to render a first image and a second image, wherein the first image and the second image are rendered based on the field of view of the virtual camera when the field of view of the virtual camera moves with the first motion information, and the resolution of the first image and the resolution of the second image are both the first resolution.

[0012] The determination module is used to determine the sampling parameters and fusion parameters of the first image and the second image;

[0013] The processing module is used to upsample the first and second images based on sampling parameters to obtain processed third and fourth images, wherein the resolution of the third and fourth images is a second resolution, which is greater than the first resolution; and

[0014] The third and fourth images are fused according to the fusion parameters to obtain the processed fifth image, the resolution of the fifth image being the second resolution.

[0015] In a third aspect, embodiments of this application provide an electronic device including a processor and a memory, the memory storing a program or instructions that can run on the processor, the program or instructions implementing the steps of the method as described in the first aspect when executed by the processor.

[0016] In a fourth aspect, embodiments of this application provide a readable storage medium on which a program or instructions are stored, which, when executed by a processor, implement the steps of the method as described in the first aspect.

[0017] In a fifth aspect, embodiments of this application provide a chip including a processor and a communication interface coupled to the processor, the processor being used to run programs or instructions to implement the steps of the method as described in the first aspect.

[0018] In a sixth aspect, embodiments of this application provide a computer program product stored in a storage medium, which is executed by at least one processor to implement the method as described in the first aspect.

[0019] In this embodiment, by employing camera shake, the rendering pipeline renders first and second images when the virtual camera's field of view is at different positions, allowing the first and second images to include different image information and complement each other. During the upsampling process of super-resolution processing, the first and second images are upsampled separately, and the image details recorded in the upsampled, higher-resolution third and fourth images are fused together, resulting in a high-resolution image with more image details. This effectively reduces the loss of image details and the degradation of image quality.

Description of the Drawings

[0020] Figure 1 is a flowchart of an image processing method according to some embodiments of this application;

[0021] Figure 2 is a schematic diagram of rendering with camera shake according to some embodiments of this application;

[0022] Figure 3 is a schematic diagram of an information processing network according to some embodiments of this application;

[0023] Figure 4 is a structural block diagram of an image processing apparatus according to some embodiments of this application;

[0024] Figure 5 is a structural block diagram of an electronic device according to an embodiment of this application;

[0025] Figure 6 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of this application.

Detailed Implementation

[0026] The technical solutions of the embodiments of this application will be clearly described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those skilled in the art based on the embodiments of this application are within the scope of protection of this application.

[0027] The terms "first," "second," etc., used in the specification and claims of this application are used to distinguish similar objects and not to describe a specific order or sequence. It should be understood that such terms can be used interchangeably where appropriate so that embodiments of this application can be implemented in orders other than those illustrated or described herein, and the objects distinguished by "first," "second," etc., are generally of the same class and the number of objects is not limited; for example, a first object can be one or more. Furthermore, in the specification and claims, "and / or" indicates at least one of the connected objects, and the character " / " generally indicates that the preceding and following objects are in an "or" relationship.

[0028] The image processing method, apparatus, electronic device, and storage medium provided in this application will be described in detail below with reference to the accompanying drawings and through specific embodiments and application scenarios.

[0029] In some embodiments of this application, an image processing method is provided. Figure 1 is a flowchart of an image processing method according to some embodiments of this application. As shown in Figure 1, the image processing method includes:

[0030] Step 102: Render the first image and the second image.

[0031] The first image and the second image are rendered based on the field of view of the virtual camera when the field of view of the virtual camera moves with the first motion information. The resolution of the first image and the resolution of the second image are both the first resolution.

[0032] In this embodiment, the first image and the second image are low-resolution images rendered using a camera shake method. The camera shake method is an image sampling method. Specifically, when rendering low-resolution images, the conventional approach is to keep the virtual camera's field of view constant and collect pixel values within that field of view. Because the pixel density is fixed, this leads to the loss of image information in the inter-pixel regions.

[0033] To address the aforementioned issues, this application embodiment controls the field of view of the virtual camera to move according to preset first motion information, thereby rendering multiple frames of images that are sequentially continuous but have different field of view positions, specifically including the aforementioned first image and second image.

[0034] The virtual camera is the user's viewpoint. The virtual camera's field of view determines the range of the game scene that the user can see in the game. Objects within the virtual camera's field of view are rendered as image objects and become part of the final on-screen image.

[0035] For example, Figure 2 is a schematic diagram of rendering with camera shake according to some embodiments of this application. As shown in Figure 2, the virtual camera's field of view shifts in different directions according to the preset first motion information, producing multiple images that record different pixel information, such as the first image 202 and the second image 204 shown in Figure 2.

[0036] As shown in Figure 2, the first image 202 records multiple pixels 2022, but the information of the interval regions between the pixels 2022 is not recorded. The field of view of the second image 204, however, is different from that of the first image 202, and the multiple pixels 2042 recorded in the second image 204 include the information of the interval regions between the pixels 2022 of the first image 202.

[0037] In other words, the first image 202 and the second image 204 record different scene details. By combining the details recorded in the first image 202 and the second image 204, the rendered image can include more details, providing more sequence information for super-resolution upsampling, thereby preserving more scene details in the high-resolution image obtained by super-resolution processing.
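The complementary sampling described above can be illustrated with a minimal sketch. It assumes a 2× scale factor and a hypothetical integer offset, in high-resolution pixels, standing in for the first motion information; the function name `sample_positions` and the offsets are illustrative and do not come from the patent.

```python
def sample_positions(low_w, low_h, scale, offset):
    """High-resolution grid coordinates covered by one low-resolution render.

    `offset` is a hypothetical (dx, dy) shift of the virtual camera's field
    of view, expressed in high-resolution pixels (an assumption).
    """
    dx, dy = offset
    return {(x * scale + dx, y * scale + dy)
            for y in range(low_h) for x in range(low_w)}


# Two 2x2 renders of the same 4x4 scene, taken with different offsets.
first = sample_positions(2, 2, 2, (0, 0))
second = sample_positions(2, 2, 2, (1, 1))

print(sorted(first))    # positions recorded by the first image
print(sorted(second))   # positions recorded by the second image
print(first & second)   # empty set: the two renders sample different points
```

Because the two sets do not overlap, the second render records positions that fall in the gaps of the first, which is the property the fusion step later relies on.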

[0038] Step 104: Determine the sampling parameters and fusion parameters of the first image and the second image.

[0039] In this embodiment of the application, during the super-resolution processing, each pixel in the image is sampled according to a certain sampling radius, thereby filling pixels and supplementing the low-resolution image into the high-resolution image.

[0040] In this embodiment, the first image and the second image are rendered by camera shaking. Since the first image and the second image have different field of view positions and record different image details, it is also necessary to determine the fusion parameters. Based on the fusion parameters, the image details in the first image and the second image are fused to obtain a high-resolution image with more image details.

[0041] Step 106: Based on the sampling parameters, upsample the first and second images to obtain the processed third and fourth images.

[0042] Among them, the resolution of the third and fourth images is the second resolution, which is greater than the first resolution.

[0043] In this embodiment, upsampling processing is performed on the first image and the second image respectively using sampling parameters. Specifically, the upsampling processing is a process of filling pixels in the low-resolution image, thereby supplementing the low-resolution image into the high-resolution image.

[0044] For example, the first image and the second image are rendered at a first resolution of 960×540. Upsampling processing can super-resolve the first image and the second image to a second resolution of 1920×1080, resulting in a third image and a fourth image with a resolution of 1920×1080.

[0045] As another example, the first image and the second image are rendered at a first resolution of 1280×720. Upsampling processing can super-resolve the first image and the second image to a second resolution of 2560×1440, resulting in a third image and a fourth image with a resolution of 2560×1440.

[0046] With an original rendering resolution of 1920×1080, reducing the original resolution to 960×540 during rendering reduces the number of pixels to be rendered from 2,073,600 to 518,400, which is equivalent to rendering only a quarter of the pixels. Therefore, the computing power required to render the image is greatly reduced. This saved computing power can be allocated to rendering at a higher frame rate, thereby improving the game's frame rate.
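The pixel counts quoted above can be checked with a few lines of arithmetic. The helper name `pixel_count` is illustrative only.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels in one frame at the given resolution."""
    return width * height


native = pixel_count(1920, 1080)   # original rendering resolution
reduced = pixel_count(960, 540)    # reduced rendering resolution
ratio = reduced / native

print(native, reduced, ratio)      # 2073600 518400 0.25
```

Halving both width and height leaves one quarter of the pixels to render, which is where the saved computing power comes from.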

[0047] Lower resolution means lower image quality. Here, upsampling is used to super-resolve the 960×540 low-resolution image into a 1920×1080 high-resolution image, so that the final image quality displayed on the screen is close to that of the original rendering resolution, thus achieving a balance between image quality and frame rate.

[0048] Step 108: Perform fusion processing on the third and fourth images according to the fusion parameters to obtain the processed fifth image, the resolution of the fifth image being the second resolution.

[0049] In this embodiment of the application, in view of the problem that super-resolution methods in the prior art will cause loss of image details, this embodiment of the application adopts camera shaking to control the field of view of the virtual camera to move according to the preset first motion information, thereby rendering multiple frames of images that are continuous in time but have different field of view positions.

[0050] Because these images have different field-of-view positions, they record different levels of detail. After super-resolution processing, the third and fourth images record different levels of detail. Therefore, based on determined fusion parameters, the third and fourth images are fused, resulting in a fifth image that includes details recorded in both the third and fourth images. Thus, the fifth image retains more detail.

[0051] This application embodiment employs camera shaking to render first and second images of the virtual camera at different positions in the rendering pipeline. This allows the first and second images to include different image information, complementing each other. During the upsampling process of super-resolution processing, the first and second images are upsampled separately, and the image details recorded in the upsampled, higher-resolution third and fourth images are fused together. This results in a high-resolution image with more detail, effectively reducing the loss of image detail and image quality degradation.
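Steps 102 to 108 can be sketched end to end as follows. The patent does not specify the upsampling kernel or the fusion formula, so nearest-neighbour upsampling and a per-pixel linear blend are used here purely as stand-ins; the names `upsample_nearest` and `fuse` and the constant weight of 0.5 are assumptions.

```python
def upsample_nearest(img, scale=2):
    """Stand-in upsampler: repeat each pixel `scale` times in both directions."""
    return [[img[y // scale][x // scale]
             for x in range(len(img[0]) * scale)]
            for y in range(len(img) * scale)]


def fuse(img_a, img_b, weights):
    """Stand-in fusion: per-pixel linear blend, weight w for img_a and 1 - w for img_b."""
    return [[w * a + (1.0 - w) * b for a, b, w in zip(row_a, row_b, row_w)]
            for row_a, row_b, row_w in zip(img_a, img_b, weights)]


# Step 102: two low-resolution renders (1x2 grayscale images for brevity).
first = [[0.0, 1.0]]
second = [[1.0, 0.0]]

# Step 104: hypothetical fusion parameters, one weight per output pixel.
weights = [[0.5] * 4 for _ in range(2)]

# Step 106: upsample each image separately to the second resolution.
third = upsample_nearest(first)
fourth = upsample_nearest(second)

# Step 108: fuse the two upsampled images into the fifth image.
fifth = fuse(third, fourth, weights)
print(fifth)
```

In the described method the weights would come from network inference rather than a constant, and the two images would first be aligned using motion vectors.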

[0052] In some embodiments of this application, rendering the first image and the second image includes:

[0053] The image rendering parameters are determined based on the field of view of the virtual camera and the first motion information, wherein the image rendering parameters include the image information to be rendered, depth information and motion vector information;

[0054] The first and second images are obtained by rendering based on the image rendering parameters;

[0055] Determining the sampling parameters and fusion parameters of the first and second images includes:

[0056] The image rendering parameters are processed to obtain the processed network input data.

[0057] The network input data is fed into the information processing network to obtain sampling parameters and fusion parameters.

[0058] In the embodiments of this application, the electronic device renders the first image and the second image in the rendering pipeline. Specifically, in the rendering pipeline, the first image and the second image are rendered by rendering techniques such as rasterization or ray tracing, and the image rendering parameters during the rendering process are saved, including the image information to be rendered, depth information and motion vector information.

[0059] The rendered image information refers to the scene, objects, and other information to be rendered within the field of view of the virtual camera. Motion vector information specifically indicates how a pixel moves in two consecutive frames of images. Specifically, because the virtual camera's field of view moves according to preset first motion information, the field of view positions of the first and second images are different, and the positions of pixels recording the same object information are also different in the two images. By recording motion vector information, it is possible to match the same pixel content in two consecutive frames, such as the first and second images.

[0060] Depth information is specifically used to determine whether objects in two temporally adjacent frames are occluded, thereby enabling more accurate matching of identical pixels in the two frames.

[0061] After the first and second images are rendered, the image rendering parameters generated during the above rendering process are preprocessed to obtain network input data, such as brightness information, speed information and occlusion information, that contains more effective information, thereby enabling better fusion of consecutive frames.
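A minimal sketch of such preprocessing for a single pixel is given below. The patent names brightness, speed and occlusion information but gives no formulas, so every formula here is an assumption: Rec. 709 luma weights for brightness, the Euclidean length of the motion vector for speed, and a depth-difference threshold (`tol`, hypothetical) for occlusion.

```python
import math


def luminance(rgb):
    """Brightness from linear RGB using Rec. 709 luma weights (assumed)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def speed(motion_vector):
    """Speed as the Euclidean length of the 2D motion vector (assumed)."""
    return math.hypot(motion_vector[0], motion_vector[1])


def occluded(depth_cur, depth_prev, tol=0.05):
    """Flag a pixel as occluded when depth changes by more than `tol` (assumed)."""
    return 1.0 if abs(depth_cur - depth_prev) > tol else 0.0


def network_input(rgb, motion_vector, depth_cur, depth_prev):
    """Pack the three per-pixel features into one tuple."""
    return (luminance(rgb), speed(motion_vector), occluded(depth_cur, depth_prev))


features = network_input((1.0, 1.0, 1.0), (3.0, 4.0), 0.50, 0.80)
print(features)
```

Three features per pixel would be consistent with a three-channel network input, although the patent does not state how the channels are assigned.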

[0062] Figure 3 is a schematic diagram of an information processing network according to some embodiments of this application. As shown in Figure 3, the preprocessed network input data (Input) is fed into the information processing network shown in Figure 3, and the specific sampling parameters and fusion parameters are obtained through network inference. Specifically, the dimension of the network input data is <1×3×540×960>, and the information processing network includes 3 convolutional layers, shown in Figure 3 as Conv1, Conv2, and Conv3. The kernel W of convolutional layer Conv1 has a size of <32×3×3×3>, and the bias term B of convolutional layer Conv1 has a size of <32>. The kernel W of convolutional layer Conv2 has a size of <32×3×3×3>, and the bias term B of convolutional layer Conv2 has a size of <32>. The kernel W of convolutional layer Conv3 has a size of <4×32×3×3>, and the bias term B of convolutional layer Conv3 has a size of <4>. The size of the convolution kernel W and the size of the bias term B in each convolutional layer can be selected according to actual needs, and this application embodiment does not impose specific limitations on them.

[0063] Convolutional layers Conv1 and Conv2, as well as Conv2 and Conv3, are connected through the ReLU function, that is, the rectified linear unit.

[0064] After the network input data passes through the 3 convolutional layers and 2 ReLU functions and is processed by the sigmoid activation function, the DepthToSpace layer transfers data from the depth dimension (Depth) to the spatial dimension (Space), resulting in the final output with a dimension of <1×1×1080×1920>, which is the sampling parameters and fusion parameters.
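The tensor shapes through this network can be traced with a short sketch. Several points are assumptions: stride 1 and padding 1 for every 3×3 convolution (needed to keep 540×960), a DepthToSpace block size of 2, and a Conv2 kernel of 32×32×3×3 so that its input channels match the 32 channels output by Conv1 (the text lists <32×3×3×3> for Conv2). The helper names are illustrative.

```python
def conv_out_shape(in_shape, kernel_shape, padding=1, stride=1):
    """Output shape (N, C, H, W) of a 2D convolution with an (O, I, kH, kW) kernel."""
    n, c, h, w = in_shape
    out_c, in_c, kh, kw = kernel_shape
    assert c == in_c, "input channels must match the kernel"
    out_h = (h + 2 * padding - kh) // stride + 1
    out_w = (w + 2 * padding - kw) // stride + 1
    return (n, out_c, out_h, out_w)


def depth_to_space_shape(in_shape, block=2):
    """Output shape after moving block*block channels into the spatial dimensions."""
    n, c, h, w = in_shape
    assert c % (block * block) == 0
    return (n, c // (block * block), h * block, w * block)


def depth_to_space(planes, block=2):
    """Interleave block*block channel planes (each H x W) into one larger plane."""
    h, w = len(planes[0]), len(planes[0][0])
    out = [[0] * (w * block) for _ in range(h * block)]
    for idx, plane in enumerate(planes):
        dy, dx = divmod(idx, block)
        for y in range(h):
            for x in range(w):
                out[y * block + dy][x * block + dx] = plane[y][x]
    return out


shape = (1, 3, 540, 960)                        # network input
shape = conv_out_shape(shape, (32, 3, 3, 3))    # Conv1 (ReLU keeps the shape)
shape = conv_out_shape(shape, (32, 32, 3, 3))   # Conv2, 32 input channels assumed
shape = conv_out_shape(shape, (4, 32, 3, 3))    # Conv3 (sigmoid keeps the shape)
final_shape = depth_to_space_shape(shape)
print(final_shape)                              # (1, 1, 1080, 1920)

# Four 1x1 channel planes become one 2x2 plane.
tiny = depth_to_space([[[1]], [[2]], [[3]], [[4]]])
print(tiny)                                     # [[1, 2], [3, 4]]
```

The four output channels of Conv3 at 540×960 thus map to one value per pixel at 1080×1920, matching the stated output dimension.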

[0065] This application embodiment performs network preprocessing on the image rendering parameters generated during the image rendering process. By inferring the sampling parameters and fusion parameters through the processing network, better upsampling effect and better fusion effect can be obtained, so that the final high-resolution image contains more picture details and improves image quality.

[0066] In some embodiments of this application, rendering the first image and the second image includes:

[0067] Render the first and second images using the image rendering pipeline;

[0068] After determining the sampling parameters and fusion parameters of the first and second images, the method further includes:

[0069] The first image, the second image, sampling parameters, and fusion parameters are sent to the image processing pipeline through the image rendering pipeline.

[0070] Based on the sampling parameters, upsampling processing is performed on the first and second images, including:

[0071] The first and second images are upsampled using the image processing pipeline;

[0072] The third and fourth images are fused according to the fusion parameters, including:

[0073] The third and fourth images are fused using the image processing pipeline; and

[0074] The method also includes:

[0075] The fifth image is sent to the image rendering pipeline via the image processing pipeline.

[0076] In this embodiment of the application, the image rendering pipeline is specifically the rendering pipeline running in the graphics processing unit (GPU). The rendering pipeline is used to render images, such as game images, user interface images, etc., wherein the first image and the second image are both rendered by the image rendering pipeline.

[0077] In a game scenario, the efficiency of the rendering pipeline determines the number of frames that can be generated per unit of time, which in turn determines the game's frame rate. To avoid placing additional load on the rendering pipeline, this application embodiment sets up a separate image processing pipeline.

[0078] The image processing pipeline is used to upsample the low-resolution image rendered by the image rendering pipeline to obtain a fused high-resolution image. In some embodiments, the image processing pipeline can be a pipeline running in the neural network processing unit (NPU) of an electronic device, or it can be run by setting up a separate external computing chip.

[0079] After the image rendering pipeline renders the first and second images, the first and second images, along with the sampling and fusion parameters generated during the rendering of the first and second images, are sent to the image processing pipeline. The image processing pipeline upsamples the first and second images and then fuses the upsampled third and fourth images to obtain a fifth image with higher resolution and more image details.

[0080] During this process, the image rendering pipeline can continue to render images, so the frame rate will not decrease due to upsampling, image fusion and other processing consuming GPU computing power.

[0081] After the image processing pipeline processes the fifth image, it is sent back to the image rendering pipeline. The image rendering pipeline continues to process the fifth image, such as overlaying user interfaces and icons, to obtain a final image frame and display it on the screen.

[0082] This application embodiment sets up an independent image processing pipeline to perform upsampling and fusion processing on the images rendered in the image rendering pipeline. Therefore, it does not occupy the computing power originally used to render the images, thus ensuring the efficiency of image rendering and guaranteeing the game frame rate while ensuring the game visuals.
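The hand-off between the two pipelines can be sketched as a producer and a consumer. Here a Python thread and `queue.Queue` objects stand in for the GPU rendering pipeline and the NPU or external chip; the string placeholders and function names are hypothetical, and the real data transfer mechanism is not described at this level in the patent.

```python
import queue
import threading


def render_pipeline(frame_ids, to_processing):
    """Stand-in for the image rendering pipeline: emits low-resolution pairs."""
    for fid in frame_ids:
        low_res_pair = (f"first[{fid}]", f"second[{fid}]")
        to_processing.put((fid, low_res_pair))
    to_processing.put(None)  # sentinel: no more frames


def processing_pipeline(to_processing, to_render):
    """Stand-in for the image processing pipeline: upsample and fuse each pair."""
    while True:
        item = to_processing.get()
        if item is None:
            to_render.put(None)
            break
        fid, _pair = item
        to_render.put((fid, f"fifth[{fid}]"))  # placeholder for the fused image


to_processing = queue.Queue()
to_render = queue.Queue()
worker = threading.Thread(target=processing_pipeline,
                          args=(to_processing, to_render))
worker.start()
render_pipeline(range(3), to_processing)  # rendering does not wait for fusion
worker.join()

# Back in the rendering pipeline: collect fused frames for UI overlay and display.
displayed = []
while True:
    item = to_render.get()
    if item is None:
        break
    displayed.append(item)
print(displayed)
```

The renderer only enqueues work and keeps going, which mirrors the claim that upsampling and fusion do not consume the rendering pipeline's computing power.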

[0083] In some embodiments of this application, the third image includes a first pixel, and the sampling parameters include a sampling radius;

[0084] The third and fourth images are fused according to the fusion parameters to obtain the processed fifth image, which includes:

[0085] Based on the pixel position and motion vector information of the first pixel in the third image, determine the pixel position of the first pixel in the fourth image;

[0086] Based on the sampling radius and the pixel position of the first pixel in the third image, a first image region is determined in the third image;

[0087] Based on the sampling radius and the pixel position of the first pixel in the fourth image, a second image region is determined in the fourth image, and the image content in the second image region is the same as the image content in the first image region.

[0088] The first and second image regions are fused based on the fusion parameters to obtain the fifth image.

[0089] In this embodiment, the sampling parameter includes the sampling radius. Specifically, during super-resolution processing, each pixel in the image is sampled according to a certain sampling radius to fill pixels and supplement the low-resolution image into the high-resolution image. In related technologies, the same sampling radius is used for different pixels in the image. When the pixel is in a corner of the image, the sampling radius will cover a large amount of invalid area, resulting in a waste of computing power.

[0090] This application embodiment uses network inference to obtain a suitable sampling radius, and adaptively obtains a suitable sampling radius for different pixels, thereby enabling more efficient reproduction of fine details. By sensing the content around the sampling point, it determines whether the pixel is at the edge of the image, and thus adjusts different sampling radii for different image content.

[0091] In this embodiment, the third image is obtained by super-resolution upsampling of the first image, and the fourth image is obtained by super-resolution upsampling of the second image. The first image and the second image are two temporally adjacent image frames. Here, the case in which the second image is the frame following the first image is taken as an example.

[0092] The first pixel can be any pixel in the third image. For each pixel in the third image, its position in the next frame, the fourth image, is tracked based on its pixel position in the third image and the saved motion vector information, so that the same image content is tracked across the third and fourth images.

[0093] Then, based on the determined sampling radius corresponding to the first pixel, the surrounding pixels of the first pixel in the third image and the first pixel in the fourth image are sampled respectively, to obtain the first image region in the third image and the second image region in the fourth image. The image content of the first image region and the second image region is the same, that is, the content within the same virtual camera field of view.

[0094] This achieves the alignment of content in two temporally consecutive image frames. Merging the aligned first and second image regions allows for the complementarity of different image details recorded in the two frames, resulting in a fifth image containing more image details.

[0095] The embodiments of this application can adaptively obtain the sampling radius for different pixels, which can reproduce fine details well. It can perceive the content around the sampling point, such as whether it is an edge, and adjust the sampling radius according to different image content. At the same time, according to the fusion parameters, it can perform better temporal fusion of frame images from different time sequences. The fusion granularity is at the pixel level, thus achieving a more refined fusion effect.
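The alignment and fusion of the two image regions can be sketched for one pixel as follows. This is a simplified illustration under stated assumptions: positions are integer (x, y) coordinates, the motion vector is an integer displacement from the third image to the fourth, out-of-range coordinates are clamped to the image border, and the fusion is a linear blend with a single hypothetical weight. None of these details are specified in the patent.

```python
def clamp(value, low, high):
    """Limit value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def track(position, motion_vector):
    """Position of the same content in the fourth image (integer motion assumed)."""
    return (position[0] + motion_vector[0], position[1] + motion_vector[1])


def region(img, center, radius):
    """Pixels within `radius` of `center`, row by row, clamped at the borders."""
    h, w = len(img), len(img[0])
    cx, cy = center
    return [img[clamp(y, 0, h - 1)][clamp(x, 0, w - 1)]
            for y in range(cy - radius, cy + radius + 1)
            for x in range(cx - radius, cx + radius + 1)]


def fuse_regions(region_a, region_b, weight):
    """Linear blend of two aligned regions (assumed fusion rule)."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(region_a, region_b)]


# The bright pixel moves one step to the right between the two frames.
third = [[0, 0, 0, 0],
         [0, 9, 0, 0],
         [0, 0, 0, 0]]
fourth = [[0, 0, 0, 0],
          [0, 0, 9, 0],
          [0, 0, 0, 0]]

first_pixel = (1, 1)            # (x, y) in the third image
motion_vector = (1, 0)          # hypothetical saved motion vector
sampling_radius = 1             # would come from network inference

pos_in_fourth = track(first_pixel, motion_vector)
first_region = region(third, first_pixel, sampling_radius)
second_region = region(fourth, pos_in_fourth, sampling_radius)
fused = fuse_regions(first_region, second_region, 0.5)
print(pos_in_fourth, first_region == second_region, fused[4])
```

After tracking, the two regions cover the same scene content, so blending them combines detail from both frames rather than smearing unrelated pixels.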

[0096] In some embodiments of this application, the image processing method is performed by an electronic device, which includes an image processing chip for generating a fifth image;

[0097] After rendering the first and second images based on image rendering parameters, the method further includes:

[0098] The image rendering parameters, the first image, and the second image are sent to the image processing chip.

[0099] In this embodiment of the application, the electronic device includes an image processing chip, which can be an external chip that does not occupy the computing power of the electronic device's own central processing unit (CPU) or graphics processing unit (GPU) to perform super-resolution processing, thereby saving more computing power to improve the game frame rate.

[0100] Specifically, it is assumed that the module performing the super-resolution and image fusion process is defined as the VNSS module, and all operations of the VNSS module are performed on the image processing chip. Specifically, after rendering the first image and the second image, the rendered first image and the second image, as well as the image rendering parameters generated during the rendering process, including the image information to be rendered, depth information, and motion vector information, are all sent to the image processing chip.

[0101] The image processing chip performs super-resolution upsampling and fusion processing to obtain a fifth image with more image details and higher resolution.

[0102] After obtaining the fifth image, the image processing chip sends the fifth image back to the original rendering pipeline. In the rendering pipeline, the fifth image at the display resolution is post-processed to obtain the target image and displayed on the screen.

[0103] This application embodiment uses an external independent image processing chip to perform super-resolution upsampling processing on low-resolution rendered images, which does not occupy the CPU and GPU computing power of electronic devices, and can free up more computing power to improve game frame rate.

[0104] In some embodiments of this application, the image processing method is performed by an electronic device, which includes a central processing unit, a graphics processing unit, and a neural network processor, the neural network processor being used to generate a fifth image;

[0105] Rendering the first and second images based on the image rendering parameters includes:

[0106] The first and second images are rendered in the graphics processor based on image rendering parameters; and

[0107] The method also includes:

[0108] The image rendering parameters, the first image, and the second image are sent to the central processing unit via the graphics processor.

[0109] The central processing unit sends the image rendering parameters, the first image, and the second image to the neural network processor.

[0110] In the embodiments of this application, the electronic device includes a central processing unit (CPU), a graphics processing unit (GPU), and a neural network processing unit (NPU). Specifically, it is assumed that the module that performs the super-resolution and image fusion process is defined as a VNSS module, and the operation of the VNSS module is performed on the NPU.

[0111] The game rendering pipeline runs on the GPU. Since there is currently no direct connection between the GPU and the NPU, after the first and second images are rendered in the GPU's rendering pipeline, both images, together with the image rendering parameters generated during rendering (the image information to be rendered, depth information, and motion vector information), are sent from the GPU's memory (i.e., video memory) to the CPU's memory. The CPU then copies this data from CPU memory to NPU memory. The NPU feeds the processed data into the network and, through network inference, obtains the sampling radius adjustment coefficient and the temporal fusion coefficient predicted by the network.

[0112] Afterward, the predicted data is copied from NPU memory to CPU memory and then from CPU memory to GPU memory, and the subsequent image processing steps continue in the GPU's rendering pipeline.
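The copy chain between GPU, CPU, and NPU memory can be sketched as follows. This is a toy model under stated assumptions: the `Memory` class, the buffer names, and `run_network_stub` are hypothetical, and the stub only writes placeholder coefficients rather than performing real inference.

```python
import copy


class Memory:
    """Stand-in for one processor's memory: a named dictionary of buffers."""

    def __init__(self, name):
        self.name = name
        self.buffers = {}


def copy_buffers(src, dst, keys, log):
    """Copy the named buffers from `src` to `dst` and record the transfer."""
    for key in keys:
        dst.buffers[key] = copy.deepcopy(src.buffers[key])
    log.append(f"{src.name}->{dst.name}")


def run_network_stub(npu):
    """Placeholder for network inference on the NPU (hypothetical values)."""
    pixels = len(npu.buffers["first_image"])
    npu.buffers["sampling_radius_coeff"] = [1.0] * pixels
    npu.buffers["temporal_fusion_coeff"] = [0.5] * pixels


def round_trip(gpu, cpu, npu):
    """GPU -> CPU -> NPU for the inputs, then NPU -> CPU -> GPU for the results."""
    log = []
    inputs = ["first_image", "second_image", "render_params"]
    outputs = ["sampling_radius_coeff", "temporal_fusion_coeff"]
    copy_buffers(gpu, cpu, inputs, log)   # video memory to CPU memory
    copy_buffers(cpu, npu, inputs, log)   # CPU memory to NPU memory
    run_network_stub(npu)                 # inference happens on the NPU
    copy_buffers(npu, cpu, outputs, log)  # predictions back to CPU memory
    copy_buffers(cpu, gpu, outputs, log)  # and on to video memory
    return log
```

Calling `round_trip` with three `Memory` objects whose GPU side holds the three input buffers returns the four transfers in order and leaves both coefficient buffers in GPU memory, ready for the rest of the rendering pipeline.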

[0113] This application embodiment achieves data interoperability between the GPU and NPU by having the central processing unit schedule the data interaction between the graphics processing unit and the neural network processor, thereby improving image processing efficiency.

[0114] The image processing method provided in this application can be executed by an image processing device. This application uses an image processing device executing the image processing method as an example to illustrate the image processing device provided in this application.

[0115] In some embodiments of this application, an image processing apparatus is provided. Figure 4 shows a structural block diagram of an image processing apparatus according to some embodiments of this application. As shown in Figure 4, the image processing apparatus 400 includes:

[0116] The rendering module 402 is used to render a first image and a second image, wherein the first image and the second image are rendered based on the field of view of the virtual camera when the field of view of the virtual camera moves with first motion information, and the resolution of the first image and the resolution of the second image are both the first resolution.

[0117] The determination module 404 is used to determine the sampling parameters and fusion parameters of the first image and the second image;

[0118] The processing module 406 is used to perform upsampling processing on the first image and the second image based on the sampling parameters to obtain the processed third image and the fourth image, wherein the resolution of the third image and the fourth image is the second resolution, which is greater than the first resolution; and to perform fusion processing on the third image and the fourth image according to the fusion parameters to obtain the processed fifth image, wherein the resolution of the fifth image is the second resolution.

[0119] This application embodiment employs camera shaking to render first and second images of the virtual camera at different positions in the rendering pipeline. This allows the first and second images to include different image information, complementing each other. During the upsampling process of super-resolution processing, the first and second images are upsampled separately, and the image details recorded in the upsampled, higher-resolution third and fourth images are fused together. This results in a high-resolution image with more detail, effectively reducing the loss of image detail and image quality degradation.
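Why two frames rendered at shifted camera positions complement each other can be shown with a deliberately simplified one-dimensional sketch. Everything here is an assumption for illustration: the `scene` function, the jitter offsets of ±0.25 low-resolution pixels, nearest-neighbour upsampling, and the hand-picked fusion weights `alphas` (in the embodiments, the sampling and fusion parameters are determined by the determination module).

```python
import math


def scene(x):
    """Hypothetical continuous 1-D scene; x is in low-resolution pixel units."""
    return math.sin(1.7 * x) + 0.5 * math.sin(5.3 * x)


def render(width, jitter):
    """Render `width` low-resolution pixels, each sampled at its centre
    shifted by `jitter` (the camera shake)."""
    return [scene(i + 0.5 + jitter) for i in range(width)]


def upsample_nearest(image, scale):
    """Repeat every low-resolution pixel `scale` times."""
    return [value for value in image for _ in range(scale)]


def fuse(third, fourth, alphas):
    """Per-pixel blend: alpha weights the third image, (1 - alpha) the fourth."""
    return [a * p + (1.0 - a) * q for p, q, a in zip(third, fourth, alphas)]


WIDTH, SCALE = 8, 2
first = render(WIDTH, -0.25)    # first image, first resolution
second = render(WIDTH, +0.25)   # second image, first resolution
third = upsample_nearest(first, SCALE)    # second resolution
fourth = upsample_nearest(second, SCALE)  # second resolution

# Hand-picked fusion weights: each output pixel takes the frame whose
# jittered sample falls exactly on that pixel's centre.
alphas = [1.0 if i % 2 == 0 else 0.0 for i in range(WIDTH * SCALE)]
fifth = fuse(third, fourth, alphas)

# Ground truth: the scene rendered directly at the second resolution.
reference = [scene((i + 0.5) / SCALE) for i in range(WIDTH * SCALE)]
```

In this toy setup `fifth` reproduces `reference`, whereas `third` or `fourth` alone does not, because each low-resolution frame holds only half of the high-resolution samples.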

[0120] In some embodiments of this application, the determining module is further configured to determine image rendering parameters based on the field of view of the virtual camera and the first motion information, wherein the image rendering parameters include image information to be rendered, depth information and motion vector information;

[0121] The rendering module is specifically used to render the first image and the second image based on the image rendering parameters.

[0122] The processing module is also used to process the image rendering parameters to obtain the processed network input data; and to input the network input data into the information processing network to obtain the sampling parameters and fusion parameters.

[0123] This application embodiment performs network preprocessing on the image rendering parameters generated during the image rendering process. By inferring the sampling parameters and fusion parameters through the processing network, better upsampling effect and better fusion effect can be obtained, so that the final high-resolution image contains more picture details and improves image quality.
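A minimal sketch of turning the image rendering parameters into network input data, and of the shape of the network output, is given below. The feature layout, the depth normalisation, and `linear_stub` are assumptions: this section does not describe the architecture of the information processing network, so a single linear layer with caller-supplied weights stands in for it.

```python
import math


def build_network_input(color, depth, motion):
    """Pack per-pixel render parameters into 6-element feature vectors.

    color:  rows of (r, g, b) tuples in [0, 1]
    depth:  rows of floats, normalised here by the frame maximum
    motion: rows of (dx, dy) motion vectors in pixels
    """
    max_depth = max(max(row) for row in depth) or 1.0
    features = []
    for color_row, depth_row, motion_row in zip(color, depth, motion):
        features.append([
            [*c, d / max_depth, *m]
            for c, d, m in zip(color_row, depth_row, motion_row)
        ])
    return features


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear_stub(features, radius_weights, fusion_weights, r_min=0.5, r_max=2.0):
    """Hypothetical stand-in for the information processing network.

    Returns one sampling radius in [r_min, r_max] and one fusion
    coefficient in [0, 1] for every pixel.
    """
    radii, fusion = [], []
    for row in features:
        radii_row, fusion_row = [], []
        for feature in row:
            r_logit = sum(w * f for w, f in zip(radius_weights, feature))
            a_logit = sum(w * f for w, f in zip(fusion_weights, feature))
            radii_row.append(r_min + (r_max - r_min) * sigmoid(r_logit))
            fusion_row.append(sigmoid(a_logit))
        radii.append(radii_row)
        fusion.append(fusion_row)
    return radii, fusion
```

The point of the sketch is the interface rather than the model: the rendering by-products go in as per-pixel features, and per-pixel sampling and fusion parameters come out.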

[0124] In some embodiments of this application, the rendering module is further configured to render a first image and a second image through an image rendering pipeline; and to send the first image, the second image, sampling parameters, and fusion parameters to an image processing pipeline through the image rendering pipeline.

[0125] The processing module is also used to perform upsampling processing on the first and second images through the image processing pipeline; to perform fusion processing on the third and fourth images through the image processing pipeline; and to send the fifth image to the image rendering pipeline through the image processing pipeline.

[0126] This application embodiment sets up an independent image processing pipeline to perform upsampling and fusion processing on the images rendered in the image rendering pipeline. Therefore, it does not occupy the computing power originally used to render the images, thus ensuring the efficiency of image rendering and guaranteeing the game frame rate while ensuring the game visuals.

[0127] In some embodiments of this application, the third image includes a first pixel, and the sampling parameters include a sampling radius;

[0128] The determining module is further configured to determine the pixel position of the first pixel in the fourth image based on the pixel position and motion vector information of the first pixel in the third image; determine a first image region in the third image based on the sampling radius and the pixel position of the first pixel in the third image; and determine a second image region in the fourth image based on the sampling radius and the pixel position of the first pixel in the fourth image, wherein the image content in the second image region is the same as the image content in the first image region.

[0129] The processing module is also used to perform fusion processing on the first image region and the second image region based on the fusion parameters to obtain the fifth image.
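The region matching and fusion performed by the determining and processing modules can be sketched for a single pixel as follows. The sign convention (position in the fourth image equals position in the third image plus the motion vector), integer motion vectors, square windows, and the function names are assumptions made for illustration.

```python
def reproject(position, motion_vector):
    """Position of the first pixel in the fourth image, given its position
    in the third image and its motion vector (assumed additive)."""
    x, y = position
    dx, dy = motion_vector
    return (x + dx, y + dy)


def region(image, centre, radius):
    """Square window of half-width `radius` around `centre` = (x, y).
    Coordinates outside the image are clamped to the border."""
    height, width = len(image), len(image[0])
    cx, cy = centre
    return [
        [
            image[min(height - 1, max(0, cy + oy))][min(width - 1, max(0, cx + ox))]
            for ox in range(-radius, radius + 1)
        ]
        for oy in range(-radius, radius + 1)
    ]


def fuse_regions(first_region, second_region, alpha):
    """Blend two aligned regions; `alpha` weights the first region."""
    return [
        [alpha * a + (1.0 - alpha) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first_region, second_region)
    ]
```

For a bright pixel at (2, 2) in the third image that moved one pixel to the right, `reproject((2, 2), (1, 0))` gives (3, 2), and the two windows extracted around those positions cover the same content, so they can be blended directly.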

[0130] The embodiments of this application can adaptively obtain the sampling radius for different pixels, which can reproduce fine details well. It can perceive the content around the sampling point, such as whether it is an edge, and adjust the sampling radius according to different image content. At the same time, according to the fusion parameters, it can perform better temporal fusion of frame images from different time sequences. The fusion granularity is at the pixel level, thus achieving a more refined fusion effect.

[0131] In some embodiments of this application, the image processing apparatus includes an image processing chip for generating a fifth image;

[0132] The processing module is also used to send image rendering parameters, the first image, and the second image to the image processing chip.

[0133] This application embodiment uses an external independent image processing chip to perform super-resolution upsampling processing on low-resolution rendered images, which does not occupy the CPU and GPU computing power of electronic devices, and can free up more computing power to improve game frame rate.

[0134] In some embodiments of this application, the image processing apparatus includes a central processing unit, a graphics processing unit, and a neural network processor, wherein the neural network processor is used to generate a fifth image;

[0135] The processing module is also used to render a first image and a second image in the graphics processor based on image rendering parameters;

[0136] The device also includes:

[0137] The data interaction module is used to send image rendering parameters, a first image, and a second image to the central processing unit via the graphics processor; and to send image rendering parameters, a first image, and a second image to the neural network processor via the central processing unit.

[0138] This application embodiment achieves data interoperability between the GPU and NPU by having the central processing unit schedule the data interaction between the graphics processing unit and the neural network processor, thereby improving image processing efficiency.

[0139] The image processing device in this application embodiment can be an electronic device or a component within an electronic device, such as an integrated circuit or a chip. The electronic device can be a terminal or a device other than a terminal. For example, the electronic device can be a mobile phone, tablet computer, laptop computer, in-vehicle electronic device, mobile internet device (MID), augmented reality (AR) / virtual reality (VR) device, robot, wearable device, ultra-mobile personal computer (UMPC), netbook, or personal digital assistant (PDA), etc. It can also be a server, network attached storage (NAS), personal computer (PC), television set (TV), ATM, or self-service machine, etc. This application embodiment does not specifically limit the device.

[0140] The image processing device in this application embodiment can be a device with an operating system. The operating system can be Android, iOS, or other possible operating systems; this application embodiment does not specifically limit the specific operating system.

[0141] The image processing apparatus provided in this application embodiment can implement the various processes implemented in the above method embodiments, and will not be described again here to avoid repetition.

[0142] Optionally, embodiments of this application also provide an electronic device. Figure 5 shows a structural block diagram of an electronic device according to an embodiment of this application. As shown in Figure 5, the electronic device 500 includes a processor 502, a memory 504, and a program or instructions stored in the memory 504 and executable on the processor 502. When the program or instructions are executed by the processor 502, they implement the various processes of the above method embodiments and achieve the same technical effects. To avoid repetition, they will not be described again here.

[0143] It should be noted that the electronic devices in the embodiments of this application include the aforementioned mobile electronic devices and non-mobile electronic devices.

[0144] Figure 6 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of this application.

[0145] The electronic device 600 includes, but is not limited to, components such as: radio frequency unit 601, network module 602, audio output unit 603, input unit 604, sensor 605, display unit 606, user input unit 607, interface unit 608, memory 609, and processor 610.

[0146] Those skilled in the art will understand that the electronic device 600 may also include a power supply (such as a battery) for supplying power to various components. The power supply may be logically connected to the processor 610 through a power management system, thereby enabling functions such as managing charging, discharging, and power consumption through the power management system. The electronic device structure shown in Figure 6 does not constitute a limitation on the electronic device. The electronic device may include more or fewer components than shown, or combine certain components, or have different component arrangements, which will not be elaborated here.

[0147] The processor 610 is used to render a first image and a second image, wherein the first image and the second image are rendered based on the field of view of the virtual camera when the field of view of the virtual camera moves with first motion information, and the resolution of the first image and the resolution of the second image are both the first resolution; to determine the sampling parameters and fusion parameters of the first image and the second image; to perform upsampling processing on the first image and the second image based on the sampling parameters to obtain a processed third image and a fourth image, wherein the resolution of the third image and the fourth image is the second resolution, and the second resolution is greater than the first resolution; and to perform fusion processing on the third image and the fourth image according to the fusion parameters to obtain a processed fifth image, the resolution of the fifth image being the second resolution.

[0148] This application embodiment employs camera shaking to render first and second images of the virtual camera at different positions in the rendering pipeline. This allows the first and second images to include different image information, complementing each other. During the upsampling process of super-resolution processing, the first and second images are upsampled separately, and the image details recorded in the upsampled, higher-resolution third and fourth images are fused together. This results in a high-resolution image with more detail, effectively reducing the loss of image detail and image quality degradation.

[0149] Optionally, the processor 610 is further configured to determine image rendering parameters based on the field of view of the virtual camera and the first motion information, wherein the image rendering parameters include information about the image to be rendered, depth information, and motion vector information; render a first image and a second image based on the image rendering parameters; perform information processing on the image rendering parameters to obtain processed network input data; and input the network input data into the information processing network to obtain sampling parameters and fusion parameters.

[0150] This application embodiment performs network preprocessing on the image rendering parameters generated during the image rendering process. By inferring the sampling parameters and fusion parameters through the processing network, better upsampling effect and better fusion effect can be obtained, so that the final high-resolution image contains more picture details and improves image quality.

[0151] Optionally, the third image includes the first pixel, and the sampling parameters include the sampling radius;

[0152] The processor 610 is further configured to: determine the pixel position of the first pixel in the fourth image based on the pixel position and motion vector information of the first pixel in the third image; determine a first image region in the third image based on the sampling radius and the pixel position of the first pixel in the third image; determine a second image region in the fourth image based on the sampling radius and the pixel position of the first pixel in the fourth image, wherein the image content in the second image region is the same as the image content in the first image region; and perform fusion processing on the first image region and the second image region based on fusion parameters to obtain a fifth image.

[0153] The embodiments of this application can adaptively obtain the sampling radius for different pixels, which can reproduce fine details well. It can perceive the content around the sampling point, such as whether it is an edge, and adjust the sampling radius according to different image content. At the same time, according to the fusion parameters, it can perform better temporal fusion of frame images from different time sequences. The fusion granularity is at the pixel level, thus achieving a more refined fusion effect.

[0154] Optionally, the processor 610 is also used to send image rendering parameters, the first image, and the second image to the image processing chip.

[0155] This application embodiment uses an external independent image processing chip to perform super-resolution upsampling processing on low-resolution rendered images, which does not occupy the CPU and GPU computing power of electronic devices, and can free up more computing power to improve game frame rate.

[0156] Optionally, the processor 610 is further configured to render a first image and a second image based on image rendering parameters in a graphics processor; send the image rendering parameters, the first image, and the second image to a central processing unit via the graphics processor; and send the image rendering parameters, the first image, and the second image to a neural network processor via the central processing unit.

[0157] This application embodiment achieves data interoperability between the GPU and NPU by having the central processing unit schedule the data interaction between the graphics processing unit and the neural network processor, thereby improving image processing efficiency.

[0158] It should be understood that, in this embodiment, the input unit 604 may include a graphics processing unit (GPU) 6041 and a microphone 6042. The GPU 6041 processes image data of still images or videos obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The display unit 606 may include a display panel 6061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 607 includes at least one of a touch panel 6071 and other input devices 6072. The touch panel 6071 is also called a touch screen. The touch panel 6071 may include two parts: a touch detection device and a touch controller. Other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (such as volume control buttons, power buttons, etc.), a trackball, a mouse, and a joystick, which will not be described in detail here.

[0159] The memory 609 can be used to store software programs and various data. The memory 609 may primarily include a first storage area for storing programs or instructions and a second storage area for storing data. The first storage area may store the operating system and the application programs or instructions required for at least one function (such as sound playback, image playback, etc.). Furthermore, the memory 609 may include volatile memory or non-volatile memory, or both. The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), or direct Rambus random access memory (DRRAM). The memory 609 in this embodiment includes, but is not limited to, these and any other suitable types of memory.

[0160] Processor 610 may include one or more processing units; optionally, processor 610 integrates an application processor and a modem processor, wherein the application processor mainly handles operations involving the operating system, user interface, and applications, and the modem processor mainly handles wireless communication signals, such as a baseband processor. It is understood that the aforementioned modem processor may also not be integrated into processor 610.

[0161] This application also provides a readable storage medium storing a program or instructions. When the program or instructions are executed by a processor, they implement the various processes of the above method embodiments and achieve the same technical effect. To avoid repetition, they will not be described again here.

[0162] The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes computer-readable storage media, such as computer read-only memory (ROM), random access memory (RAM), magnetic disk, or optical disk.

[0163] This application also provides a chip, which includes a processor and a communication interface. The communication interface and the processor are coupled. The processor is used to run programs or instructions to implement the various processes of the above method embodiments and achieve the same technical effect. To avoid repetition, it will not be described again here.

[0164] It should be understood that the chip mentioned in the embodiments of this application may also be referred to as a system-level chip, system chip, chip system, or system-on-a-chip, etc.

[0165] This application provides a computer program product, which is stored in a storage medium and executed by at least one processor to implement the various processes of the above method embodiments and achieve the same technical effects. To avoid repetition, it will not be described again here.

[0166] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element. Furthermore, it should be noted that the scope of the methods and apparatuses in the embodiments of this application is not limited to performing functions in the order shown or discussed, but may also include performing functions substantially simultaneously or in the reverse order, depending on the functions involved. For example, the described methods may be performed in a different order than described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.

[0167] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a computer software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal (which may be a mobile phone, computer, server, or network device, etc.) to execute the methods of the various embodiments of this application.

[0168] The embodiments of this application have been described above with reference to the accompanying drawings. However, this application is not limited to the specific embodiments described above. The specific embodiments described above are merely illustrative and not restrictive. Those skilled in the art can devise many other forms under the guidance of this application without departing from the spirit of this application and the scope of the claims, and all of these forms are within the protection scope of this application.

Claims

1. An image processing method, characterized in that the method includes: rendering a first image and a second image, wherein the first image and the second image are rendered based on the field of view of the virtual camera when the field of view of the virtual camera moves with first motion information, and the resolution of the first image and the resolution of the second image are both the first resolution; determining the sampling parameters and fusion parameters of the first image and the second image; upsampling the first image and the second image based on the sampling parameters to obtain the processed third image and the fourth image, wherein the resolution of the third image and the fourth image is a second resolution, which is greater than the first resolution; and fusing the third image and the fourth image according to the fusion parameters to obtain a processed fifth image, the resolution of which is the second resolution.

2. The image processing method according to claim 1, characterized in that, The rendering of the first image and the second image includes: Image rendering parameters are determined based on the field of view of the virtual camera and the first motion information, wherein the image rendering parameters include image information to be rendered, depth information, and motion vector information; The first image and the second image are rendered based on the image rendering parameters; The determination of the sampling parameters and fusion parameters of the first image and the second image includes: The image rendering parameters are processed to obtain processed network input data; The network input data is input into the information processing network to obtain the sampling parameters and the fusion parameters.

3. The image processing method according to claim 1, characterized in that, The rendering of the first image and the second image includes: The first image and the second image are rendered using the image rendering pipeline; After determining the sampling parameters and fusion parameters of the first image and the second image, the method further includes: The first image, the second image, the sampling parameters, and the fusion parameters are sent to the image processing pipeline through the image rendering pipeline; The upsampling process for the first image and the second image based on the sampling parameters includes: The first image and the second image are upsampled using the image processing pipeline; The step of fusing the third image and the fourth image according to the fusion parameters includes: The third image and the fourth image are fused using the image processing pipeline; and The method further includes: The fifth image is sent to the image rendering pipeline via the image processing pipeline.

4. The image processing method according to claim 2, characterized in that, The third image includes a first pixel, and the sampling parameters include a sampling radius; The step of fusing the third image and the fourth image according to the fusion parameters to obtain the processed fifth image includes: The pixel position of the first pixel in the fourth image is determined based on the pixel position of the first pixel in the third image and the motion vector information; Based on the sampling radius and the pixel position of the first pixel in the third image, a first image region is determined in the third image; Based on the sampling radius and the pixel position of the first pixel in the fourth image, a second image region is determined in the fourth image, and the image content in the second image region is the same as the image content in the first image region. The first image region and the second image region are fused based on the fusion parameters to obtain the fifth image.

5. The image processing method according to claim 2, characterized in that, The image processing method is performed by an electronic device, which includes an image processing chip for generating the fifth image. After rendering the first image and the second image based on the image rendering parameters, the method further includes: The image rendering parameters, the first image, and the second image are sent to the image processing chip.

6. The image processing method according to claim 2, characterized in that, The image processing method is executed by an electronic device, which includes a central processing unit, a graphics processing unit, and a neural network processor, the neural network processor being used to generate the fifth image; The process of rendering the first image and the second image based on the image rendering parameters includes: The first image and the second image are rendered in the graphics processor based on the image rendering parameters; and The method further includes: The graphics processor sends the image rendering parameters, the first image, and the second image to the central processing unit. The central processing unit sends the image rendering parameters, the first image, and the second image to the neural network processor.

7. An image processing apparatus, characterized in that, The image processing apparatus includes: A rendering module, used to render a first image and a second image, wherein the first image and the second image are rendered based on the field of view of the virtual camera when the field of view of the virtual camera moves with first motion information, and the resolution of the first image and the resolution of the second image are both the first resolution; A determining module, used to determine the sampling parameters and fusion parameters of the first image and the second image; A processing module, configured to upsample the first image and the second image based on the sampling parameters to obtain processed third and fourth images, wherein the resolution of the third image and the fourth image is a second resolution, which is greater than the first resolution; and to fuse the third image and the fourth image according to the fusion parameters to obtain a processed fifth image, the resolution of which is the second resolution.

8. The image processing apparatus according to claim 7, characterized in that the determining module is further configured to determine image rendering parameters based on the field of view of the virtual camera and the first motion information, wherein the image rendering parameters include image information to be rendered, depth information, and motion vector information; the rendering module is specifically configured to render the first image and the second image based on the image rendering parameters; and the processing module is further configured to process the image rendering parameters to obtain network input data, and to input the network input data into an information processing network to obtain the sampling parameters and the fusion parameters.

9. The image processing apparatus according to claim 7, characterized in that the rendering module is further configured to render the first image and the second image through an image rendering pipeline, and to send the first image, the second image, the sampling parameters, and the fusion parameters to an image processing pipeline through the image rendering pipeline; and the processing module is further configured to upsample the first image and the second image through the image processing pipeline, to fuse the third image and the fourth image through the image processing pipeline, and to send the fifth image to the image rendering pipeline through the image processing pipeline.

10. The image processing apparatus according to claim 8, characterized in that the third image includes a first pixel, and the sampling parameters include a sampling radius; the determining module is further configured to: determine the pixel position of the first pixel in the fourth image based on the pixel position of the first pixel in the third image and the motion vector information; determine a first image region in the third image based on the sampling radius and the pixel position of the first pixel in the third image; and determine a second image region in the fourth image based on the sampling radius and the pixel position of the first pixel in the fourth image, wherein the image content in the second image region is the same as the image content in the first image region; and the processing module is further configured to fuse the first image region and the second image region based on the fusion parameters to obtain the fifth image.
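The region-based fusion recited in claim 10 can be sketched as follows. Several details are assumptions for illustration only and are not stated in the claim: the motion vector is an integer (row, column) offset added to the position in the third image, the image region is a square window of the given radius clamped at the image borders, each region is reduced to its mean, and the fusion parameter is a single blend weight. All function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]
Pos = Tuple[int, int]  # (row, column)


def reproject(pos: Pos, motion: Pos) -> Pos:
    """Position of the first pixel in the fourth image.

    Assumes the motion vector is an integer offset added to the position
    in the third image; the sign convention is an assumption.
    """
    return (pos[0] + motion[0], pos[1] + motion[1])


def region(img: Image, center: Pos, radius: int) -> List[float]:
    """Pixels in the square window of `radius` around `center`, clamped to the image."""
    h, w = len(img), len(img[0])
    cy, cx = center
    values = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y = min(max(cy + dy, 0), h - 1)
            x = min(max(cx + dx, 0), w - 1)
            values.append(img[y][x])
    return values


def fuse_pixel(third: Image, fourth: Image, pos: Pos, motion: Pos,
               radius: int, weight: float) -> float:
    """Fuse the first and second image regions into one output pixel value."""
    first_region = region(third, pos, radius)
    second_region = region(fourth, reproject(pos, motion), radius)
    mean_first = sum(first_region) / len(first_region)
    mean_second = sum(second_region) / len(second_region)
    return weight * mean_first + (1.0 - weight) * mean_second


third = [[1.0] * 3 for _ in range(3)]
fourth = [[3.0] * 3 for _ in range(3)]
value = fuse_pixel(third, fourth, pos=(1, 1), motion=(0, 1), radius=1, weight=0.25)
```

Following the motion vector before sampling is what lets the two regions cover the same image content even though the virtual camera moved between the two renders.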

11. The image processing apparatus according to claim 8, characterized in that the image processing apparatus includes an image processing chip for generating the fifth image; and the processing module is further configured to send the image rendering parameters, the first image, and the second image to the image processing chip.

12. The image processing apparatus according to claim 8, characterized in that the image processing apparatus includes a central processing unit, a graphics processing unit, and a neural network processor, wherein the neural network processor is used to generate the fifth image; the processing module is further configured to render the first image and the second image in the graphics processing unit based on the image rendering parameters; and the apparatus further includes a data interaction module configured to send the image rendering parameters, the first image, and the second image to the central processing unit via the graphics processing unit, the central processing unit then sending the image rendering parameters, the first image, and the second image to the neural network processor.

13. An electronic device, characterized in that it includes a processor and a memory, the memory storing a program or instructions executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the method according to any one of claims 1 to 6.

14. A readable storage medium, characterized in that the readable storage medium stores a program or instructions that, when executed by a processor, implement the steps of the method according to any one of claims 1 to 6.