Image generation method, device and storage medium

By introducing a spatiotemporal importance-aware noise scheduling mechanism into the diffusion model, the application addresses the diffusion model's lack of semantic interpretability and flexibility in image generation and enables user participation in, and fine-grained control of, the image generation process, thereby improving the model's applicability and image quality.

CN122199720A · Pending · Publication Date: 2026-06-12 · HARBIN INST OF TECH


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: HARBIN INST OF TECH
Filing Date: 2026-02-28
Publication Date: 2026-06-12

Application Information

Patent Timeline: 28 Feb 2026 (Application); 12 Jun 2026 (Publication, CN122199720A)

IPC: G06T11/60; G06N3/0475; G06N3/045

AI Tagging

Application Domain

Biological models; Editing/combining figures or text


AI Technical Summary

Technical Problem

Existing diffusion models lack semantic interpretability and flexibility in image generation, making it difficult to adapt to downstream tasks such as super-resolution, thus limiting the model's versatility.

Method Used

A pre-trained diffusion model is combined with a spatial importance matrix and a temporal decay function to determine a closed-form solution for the noisy image and a weighted loss function. This enables spatiotemporal importance-aware noise scheduling and allows users to intervene and adjust at each iteration.

Benefits of Technology

It achieves semantic interpretability and flexibility in the image generation process, enabling users to effectively participate in and guide the generation process, thereby improving the model's versatility and the precision of image generation.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses an image generation method, device and storage medium. The method comprises the following steps: in response to a first input operation, determining a spatial importance matrix and a temporal decay function according to an input image generation task and a hyperparameter set of a diffusion model; determining a closed-form solution of a noisy image relative to an initial image in a current iteration step according to the spatial importance matrix, the temporal decay function and a noise variance, so as to determine a target noisy image of a target iteration step; and iteratively generating a target image based on a preset denoising logic according to the target iteration step and the target noisy image. The method provides a spatiotemporal importance-aware noise scheduling mechanism, so that the diffusion model can apply a differentiated noise addition strategy at different spatial positions and time steps according to a user-specified importance distribution, thereby moving from "uniform noise" to "adaptive noise" and preserving the generality of the overall image generation process.


Description

Technical Field

[0001] This application relates to an image generation method, an electronic device, and a computer-readable storage medium, belonging to the field of computer image processing technology.

Background Technology

[0002] Currently, diffusion models have become a common approach in image generation, synthesizing images by progressively denoising uniform Gaussian noise. A standard diffusion model includes a forward noise addition process and a backward denoising process. However, current diffusion models suffer from the following significant drawbacks: first, the intermediate results of the inference process lack semantic interpretability, making it difficult for users to effectively participate in and guide the generation process; second, the model's generation scale is fixed, making it difficult to adapt flexibly to downstream tasks such as super-resolution, which limits the model's versatility.

Summary of the Invention

[0003] This application discloses an image generation method, an electronic device, and a computer-readable storage medium.

[0004] The image generation method in this application is based on a pre-trained diffusion model, and the method includes:

[0005] In response to the first input operation, the spatial importance matrix and the temporal decay function are determined based on the input image generation task and the hyperparameter set of the diffusion model; based on the spatial importance matrix, the temporal decay function, and the noise variance, the closed-form solution of the noisy image relative to the initial image in the current iteration step and the weighted loss function of the diffusion model are determined, so as to determine the target noisy image for the target iteration step, wherein the initial image is associated with the image generation task; and based on the target iteration step and the target noisy image, the target image is iteratively generated according to a preset denoising logic.

[0006] In some implementations, determining the spatial importance matrix and the temporal decay function in response to the first input operation, based on the input image generation task and the hyperparameter set of the diffusion model, includes: in response to the first input operation, determining importance rule parameters for each region in the initial image according to the image generation task, so as to determine the spatial importance matrix, wherein the dimension of the spatial importance matrix is height by width, the height and width being those of the initial image in pixels; and determining the temporal decay function based on the hyperparameter set of the diffusion model and preset constraints, wherein the temporal decay function is monotonically decreasing.

[0007] In some implementations, determining the closed-form solution of the noisy image relative to the initial image in the current iteration step and the weighted loss function of the diffusion model based on the spatial importance matrix, the temporal decay function, and the noise variance, so as to determine the target noisy image for the target iteration step, includes: determining the spatiotemporal importance combination function based on the spatial importance matrix and the temporal decay function; determining a spatiotemporal adaptive model based on the spatiotemporal importance combination function and the noise variance; and, based on the spatiotemporal adaptive model, determining the closed-form solution and the weighted loss function, so as to determine the target noisy image for the target iteration step.

[0008] In some implementations, the temporal decay function is defined as follows:

[0009] where the terms of the formula denote, in order: the temporal decay function; the ordinal number of the current iteration step; the initial value of the temporal decay function, taken at a specified iteration step; the upper limit of the iteration steps; and the hyperparameter set. The hyperparameter set is formulated as follows:

[0010] where the terms denote, in order: the pixel abscissa; the pixel ordinate; the index of the product, used as a substitute subscript; the noise variance at any iteration step from 1 up to the upper bound of the product; and the spatiotemporal importance combination function. The spatiotemporal importance combination function is formulated as follows:

[0011] where the terms denote the spatial importance matrix and the temporal decay function, respectively.

[0012] In some implementations, the spatiotemporal adaptive model is defined as follows:

[0013] where the terms denote, in order: the spatiotemporal adaptive model; the noise variance at a given iteration step; the initial processed image at an iteration step, obtained by processing the initial image through one or more iteration steps; the identity matrix; and the normal distribution.

[0014] In some implementations, when the initial image is determined, the closed-form solution of the noisy image relative to the initial image in the current iteration step is given by the following formula:

[0015] where the terms denote, in order: the initial image; standard Gaussian noise; and element-wise multiplication. The weighted loss function is formulated as follows:

[0016] where the terms denote, in order: the weighted loss function; the joint expectation over three random variables, namely the iteration step, the initial image, and the standard Gaussian noise actually added; the predicted Gaussian noise; and a stability constant used to ensure numerical stability.

[0017] In some implementations, iteratively generating the target image based on the target iteration step and the target noisy image according to the preset denoising logic includes: generating the original target image for the current iteration step based on the target iteration step and the target noisy image, according to the preset denoising logic; and, in response to a second input operation, updating the spatial importance matrix based on the original target image and the input region to be corrected, so as to generate the target image.

[0018] In some implementations, updating the spatial importance matrix in response to the second input operation, based on the original target image and the input region to be corrected, so as to generate the target image, includes: in response to the second input operation, resetting the importance rule parameters corresponding to the region to be corrected in the original target image, so as to update the spatial importance matrix; backing off the iteration step for the region to be corrected to determine a response iteration step; and, based on the response iteration step, the updated spatial importance matrix, and the temporal decay function, re-adding noise to the original target image to determine the target noisy image for the next iteration step, thereby generating the target image.

[0019] The electronic device in this application includes a memory and a processor. The memory stores a computer program, and when the computer program is executed by the processor, the image generation method described above is implemented.

[0020] The computer-readable storage medium in the embodiments of this application stores a computer program that, when executed by one or more processors, implements the image generation method described above.

[0021] The beneficial effects of this application are as follows: the image generation method provided in this application offers a spatiotemporal importance-aware noise scheduling mechanism, enabling the diffusion model to apply differentiated noise addition strategies at different spatial locations and time steps according to the importance distribution specified by the user. This achieves a transition from "uniform noise" to "adaptive noise" and preserves the versatility of the overall image generation process. Furthermore, combined with user intervention at each iteration step, the method allows users to preview, interrupt, and adjust the denoising process in real time, enabling effective user participation in and guidance of the image generation process.

Attached Figure Description

[0022] Figure 1 is the first flowchart of the image generation method in the embodiments of this application; Figure 2 is the second flowchart of the image generation method in the embodiments of this application; Figure 3 is the third flowchart of the image generation method in the embodiments of this application; Figure 4 is the fourth flowchart of the image generation method in the embodiments of this application; Figure 5 is the fifth flowchart of the image generation method in the embodiments of this application.

Detailed Implementation

[0023] Referring to Figure 1, the image generation method in this application is based on a pre-trained diffusion model and includes the following steps:
Step 01: In response to the first input operation, determine the spatial importance matrix and the temporal decay function based on the input image generation task and the hyperparameter set of the diffusion model;
Step 02: Based on the spatial importance matrix, the temporal decay function, and the noise variance, determine the closed-form solution of the noisy image relative to the initial image in the current iteration step, as well as the weighted loss function of the diffusion model, so as to determine the target noisy image for the target iteration step, wherein the initial image is associated with the image generation task;
Step 03: Based on the target iteration step and the target noisy image, generate the target image iteratively according to the preset denoising logic.

[0024] Specifically, the image generation method in this application is based on a preset diffusion model. The images targeted by the diffusion model generally have a uniform target resolution. The image generation task is generally associated with a preset image dataset. For example, image generation tasks for conceptual design are associated with the LAION-5B dataset, image generation tasks for cultural relic restoration are associated with the Places365 dataset, and image generation tasks for image expansion are associated with the COCO dataset and the Landscape dataset, etc.

[0025] An image generation method generally consists of two parts: adding noise to the initial image to generate a noisy image, and then denoising the noisy image to generate the target image. The noise addition process is a forward process: the iteration step starts from 0 and increases until the preset upper limit of the iteration steps is reached. The denoising process is a backward process: the iteration step starts from the preset upper limit and decreases until it reaches 0. The upper limit of the iteration steps is typically set to 20-100 when the diffusion model is initialized. The initialization process also sets hyperparameters such as the noise variance of the diffusion model, the iteration step back-off amount applied when the user intervenes in the denoising process, the importance weight range, and the stability constant. The importance weight range is generally set to [0,1].
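The initialization values listed in this paragraph can be gathered in one place, as in the minimal sketch below. The class name, field names, default numbers, and the linear variance schedule are all illustrative assumptions; the application only states the typical range of the upper limit (20-100) and the weight range [0,1].

```python
from dataclasses import dataclass, field


@dataclass
class DiffusionInit:
    """Hypothetical container for the values set at initialization."""

    max_steps: int = 50            # upper limit of iteration steps (text: typically 20-100)
    beta_start: float = 1e-4       # assumed smallest noise variance
    beta_end: float = 2e-2         # assumed largest noise variance
    backoff_steps: int = 10        # iteration step back-off amount on user intervention
    weight_range: tuple = (0.0, 1.0)  # importance weight range stated in the text
    stability_eps: float = 1e-8    # assumed value of the stability constant
    betas: list = field(init=False)

    def __post_init__(self):
        if not 1 <= self.backoff_steps < self.max_steps:
            raise ValueError("backoff_steps must be in [1, max_steps)")
        # Assumption: a linear noise variance schedule, one value per iteration step.
        n = self.max_steps
        if n == 1:
            self.betas = [self.beta_start]
        else:
            delta = (self.beta_end - self.beta_start) / (n - 1)
            self.betas = [self.beta_start + k * delta for k in range(n)]
```

Calling `DiffusionInit()` yields 50 increasing variances; any other schedule could be substituted without changing the rest of the method.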

[0026] For the noise addition process, the core objective is to use a spatiotemporal importance function, which combines spatial differentiation with temporal decay, as the decision basis for adding noise. Generally, noise is added to the initial image according to the image generation task information that the user enters through the first input operation. The initial image is directly associated with the image generation task: from the information entered through the first input operation, the diffusion model can either extract the corresponding initial image from the corresponding image dataset or obtain the initial image directly from the input information.

[0027] In some implementations, referring to Figure 2, Step 01 includes: Step 011: In response to the first input operation, determine the importance rule parameters of each region in the initial image according to the image generation task, so as to determine the spatial importance matrix, wherein the dimension of the spatial importance matrix is height by width, the height and width being those of the initial image in pixels; Step 012: Determine the temporal decay function based on the hyperparameter set of the diffusion model and the preset constraints, wherein the temporal decay function is monotonically decreasing.

[0028] Specifically, for the noise addition process, the image generation method in this application provides the following approach: a spatiotemporal importance model obtained by combining the spatial importance matrix and the temporal decay function is used to perform fast computation, so that the noise-added image at any iteration step can be obtained without having to calculate from the initial iteration step to the upper limit of the iteration step.

[0029] The spatial importance matrix is a matrix of the same size as the initial image that marks the relative importance of different regions in the image. Each pixel corresponds to an importance weight, generally a number not less than 0 and not greater than 1. For example, in an interactive image generation task, the importance weight of each pixel in the main region of interest to the user can be set to 1, the importance weight of each pixel in the background region can be set to 0, and the importance weight of each pixel in the boundary region between the main and background regions can be set to a number between 0 and 1, depending on the situation. Similarly, in an image inpainting task, the importance weight of each pixel in the mask region is set to 1, while the importance weight of each pixel in the known context region is set to 0, and so on. The importance weights form the spatial importance matrix according to the pixel arrangement order, so the dimension of the spatial importance matrix is height by width, where height and width are those of the initial image in pixels.
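As an illustration of the weighting rule described here, the sketch below builds a spatial importance matrix for a rectangular subject region: 1 inside, an intermediate value on a thin boundary ring, 0 in the background. The function name, the rectangular region, the ring width, and the boundary weight of 0.5 are assumptions chosen for the example; the application leaves the boundary value open ("between 0 and 1, depending on the situation").

```python
def build_importance_matrix(height, width, subject_box, border=1, border_weight=0.5):
    """Return a height x width list of lists of weights in [0, 1].

    subject_box is (top, left, bottom, right) with bottom/right exclusive.
    All parameter names are hypothetical.
    """
    top, left, bottom, right = subject_box
    matrix = [[0.0] * width for _ in range(height)]  # background weight 0
    for i in range(height):
        for j in range(width):
            inside = top <= i < bottom and left <= j < right
            near = (top - border <= i < bottom + border
                    and left - border <= j < right + border)
            if inside:
                matrix[i][j] = 1.0            # main region of interest
            elif near:
                matrix[i][j] = border_weight  # boundary between subject and background
    return matrix
```

For an inpainting task the same helper could be used with the mask region passed as `subject_box` and `border=0`.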

[0030] The temporal decay function primarily controls how the global noise level of the image decreases as the iteration step increases. Generally, the temporal decay function is a monotonically decreasing function with a range of [0,1]: its initial value is 1, and when the iteration step t reaches the upper limit of the iteration steps, the function value is 0. Its monotonically decreasing behavior is synchronized with the decay of the standard deviation when the diffusion model adds noise to the image. These characteristics are primarily controlled by the hyperparameter set of the diffusion model. For the specific formula of the temporal decay function and the relationship between the spatial importance matrix and the temporal decay function, refer to the following example.
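The constraints stated here (monotonically decreasing, value 1 at the start, value 0 at the upper limit) can be met by many functions. The two below are stand-ins chosen only because they satisfy those constraints; they are not the application's Formula 1, which depends on the hyperparameter set and is not reproduced in this text.

```python
import math


def linear_decay(t, max_steps):
    """Assumed stand-in: falls linearly from 1 at t = 0 to 0 at t = max_steps."""
    if not 0 <= t <= max_steps:
        raise ValueError("t must lie in [0, max_steps]")
    return 1.0 - t / max_steps


def cosine_decay(t, max_steps):
    """Assumed alternative with the same endpoints and a flatter start and end."""
    if not 0 <= t <= max_steps:
        raise ValueError("t must lie in [0, max_steps]")
    return 0.5 * (1.0 + math.cos(math.pi * t / max_steps))
```

Either function can be passed wherever a temporal decay is needed, since both map the step range onto [0,1] and never increase.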

[0031] In some implementations, referring to Figure 3, Step 02 includes: Step 021: Determine the spatiotemporal importance combination function based on the spatial importance matrix and the temporal decay function; Step 022: Determine the spatiotemporal adaptive model based on the spatiotemporal importance combination function and the noise variance; Step 023: Based on the spatiotemporal adaptive model, determine the closed-form solution and the weighted loss function, so as to determine the target noisy image for the target iteration step.

[0032] Specifically, based on the determined spatial importance matrix and temporal decay function, the two are combined to form a spatiotemporal importance combination function. The spatiotemporal importance combination function is the core of the spatiotemporal importance-aware noise scheduling mechanism, reconstructing the image generation task into a spatiotemporally dependent dynamic optimization problem.

[0033] Specifically, the temporal decay function is shown in Formula 1: ... Formula 1, where the terms denote, in order: the temporal decay function; the ordinal number of the current iteration step; the initial value of the temporal decay function, taken at a specified iteration step; the upper limit of the iteration steps; and the hyperparameter set.

[0034] Furthermore, the hyperparameter set is shown in Formula 2: ... Formula 2, where the terms denote, in order: the pixel abscissa; the pixel ordinate; the index of the product, used as a substitute subscript; the noise variance at any iteration step from 1 up to the upper bound of the product; and the spatiotemporal importance combination function.

[0035] The spatiotemporal importance combination function is shown in Formula 3: ... Formula 3, where the terms denote the spatial importance matrix and the temporal decay function, respectively.

[0036] The spatiotemporal importance combination function is thus modulated spatially by the spatial importance matrix and controlled temporally by the temporal decay function.
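One simple way to obtain a function that is modulated spatially by the matrix and temporally by the decay is an element-wise product. The sketch below assumes that form; the actual Formula 3 is not reproduced in this text, so the product, the function name, and the default linear decay are assumptions.

```python
def combined_importance(spatial, t, max_steps, decay=None):
    """Assumed combination: W[i][j] = spatial[i][j] * decay(t)."""
    if decay is None:
        # Assumed default: linear decay from 1 at t = 0 to 0 at t = max_steps.
        decay = lambda step: 1.0 - step / max_steps
    g = decay(t)
    return [[m * g for m in row] for row in spatial]
```

Under this assumption the combined value stays in [0,1], equals the spatial weight at step 0, and vanishes everywhere at the upper limit of the iteration steps.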

[0037] Next, based on the spatiotemporal importance combination function and the noise variance of the diffusion model itself, the forward noise addition process is redefined in a spatiotemporally adaptive form. In the related art, the forward noise addition process of the diffusion model is shown in Formula 4: ... Formula 4, where the terms denote, in order: the model of the noise addition step (redefined in Formula 5 as the spatiotemporal adaptive model); the noise variance at a given iteration step; the initial processed image at an iteration step, obtained by processing the initial image through one or more iterations; the identity matrix; and the normal distribution.
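For orientation, the forward noise addition step of a standard denoising diffusion probabilistic model is conventionally written as follows. The symbols used here (x_t for the image at step t, β_t for the noise variance, I for the identity matrix, 𝒩 for the normal distribution) follow common usage in the literature and are an assumption; the application's own notation for Formula 4 may differ.

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right)
```

In this conventional form the same variance is applied to every pixel at a given step, which is the "uniform noise" behavior that the spatiotemporal adaptive model replaces.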

[0038] Redefining Formula 4 with the spatiotemporal importance combination function yields the spatiotemporal adaptive model shown in Formula 5: ... Formula 5. Based on this spatiotemporal adaptive model, the closed-form solution for the noisy image relative to the initial image in the current iteration step is derived. For example, given a fixed initial image, and assuming that the spatiotemporal importance function changes sufficiently slowly (i.e., the derivative of its corresponding smooth fitting function is less than a preset derivative threshold), an approximate closed-form solution for the initial processed image at a given iteration step can be derived, as shown in Formula 6.

[0039] ... Formula 6, where the terms denote, in order: the initial image; standard Gaussian noise; and element-wise multiplication.

[0040] In this way, the approximate closed-form result shown in Formula 6 can be obtained without iterating from the initial state up to the upper limit of the iteration steps. Once the initial processed image at an arbitrary iteration step has been obtained, the target noisy image used in the denoising process can be obtained. This not only improves the efficiency of the noise addition process but also enables noise scheduling centered on spatiotemporal adaptation, thereby preserving the versatility of the overall image generation process.
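The idea of jumping directly to a given iteration step can be sketched as below. The sketch assumes that each pixel's effective variance at step s is the scheduled variance scaled by the combined importance, and that the cumulative factor is the per-pixel product of (1 minus that effective variance) over steps 1 to t. Both are guesses at the shape of Formulas 2, 5, and 6, which are not reproduced in this text; `weight` is a hypothetical callable standing in for the spatiotemporal importance combination function.

```python
import math
import random


def per_pixel_alpha_bar(betas, weight, i, j, t):
    """Assumed cumulative factor: product over s = 1..t of (1 - betas[s-1] * weight(i, j, s))."""
    prod = 1.0
    for s in range(1, t + 1):
        prod *= 1.0 - betas[s - 1] * weight(i, j, s)
    return prod


def noisy_image_closed_form(x0, betas, weight, t, rng=None):
    """Sample the noisy image at step t directly from x0, without building steps 1..t-1.

    Assumed form: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps, applied per pixel,
    with eps drawn from a standard Gaussian.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    out = []
    for i, row in enumerate(x0):
        out_row = []
        for j, pixel in enumerate(row):
            a_bar = per_pixel_alpha_bar(betas, weight, i, j, t)
            eps = rng.gauss(0.0, 1.0)
            out_row.append(math.sqrt(a_bar) * pixel + math.sqrt(1.0 - a_bar) * eps)
        out.append(out_row)
    return out
```

With a weight of 0 a pixel is left untouched at every step, while a weight of 1 reproduces the usual uniform schedule for that pixel, which illustrates the move from "uniform noise" to "adaptive noise".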

[0041] Based on the above noise addition process, a weighted loss function can, by way of example, be further defined from the spatiotemporal adaptive model shown in Formula 5 and the closed-form solution shown in Formula 6. This weighted loss function is mainly used to optimize the diffusion model: it assigns higher loss weights to image regions with higher spatial importance weights (i.e., weights close to 1), so that the diffusion model preferentially learns the denoising rules for those regions. The weighted loss function is shown in Formula 7: ... Formula 7, where the terms denote, in order: the weighted loss function; the joint expectation over three random variables, namely the iteration step, the initial image, and the standard Gaussian noise actually added; the predicted Gaussian noise; and a stability constant used to ensure numerical stability. The standard Gaussian noise in Formula 7 is the real noise actually added, whereas the predicted Gaussian noise is obtained from the diffusion model.
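A minimal sketch of an importance-weighted noise prediction loss for a single sample follows. It assumes the weights multiply the squared error per pixel and that the stability constant guards the normalizing denominator; where the constant actually sits in Formula 7 is not recoverable from this text, so this placement is an assumption, as are the function and parameter names.

```python
def weighted_noise_loss(eps_true, eps_pred, weights, stability=1e-8):
    """Assumed form: sum(w * (eps - eps_hat)^2) / (sum(w) + stability).

    eps_true, eps_pred and weights are equally sized lists of lists.
    """
    numerator = 0.0
    denominator = 0.0
    for row_true, row_pred, row_w in zip(eps_true, eps_pred, weights):
        for e, p, w in zip(row_true, row_pred, row_w):
            numerator += w * (e - p) ** 2   # errors in important pixels count more
            denominator += w
    return numerator / (denominator + stability)
```

In training, the joint expectation over iteration step, initial image, and added noise would be approximated by averaging this quantity over sampled triples. An all-zero weight map returns 0 rather than dividing by zero, which is the role assumed here for the stability constant.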

[0042] In some implementations, referring to Figure 4, Step 03 specifically includes: Step 031: Based on the target iteration step and the target noisy image, generate the original target image for the current iteration step according to the preset denoising logic; Step 032: In response to the second input operation, update the spatial importance matrix based on the original target image and the input region to be corrected, so as to generate the target image.

[0043] Specifically, after the noise addition process is completed, a preset denoising logic that supports dynamic interactive intervention by the user at each iteration step is applied. That is, the target noisy image generated during the noise addition process is used as the basis, and denoising is performed step by step through multiple reverse iterations to ultimately generate the target image. For example, if the target iteration step of the noise addition process is the upper limit of the iteration steps, the denoising process starts from that upper limit and performs multiple reverse iterations, with the iteration step gradually decreasing, until denoising is completed and the target image is generated.

[0044] The core mechanism of this denoising process is to provide the user with an image preview after each iteration. Users can interactively mark areas of the image where the result is unsatisfactory in order to intervene in the denoising process. After the user marks the areas to be corrected, only the spatial importance weights of the pixels within those areas are adjusted to update the spatial importance matrix. Based on the updated spatial importance weights, iteration step back-off and partial noise re-injection are performed on the areas to be corrected and on the transition areas between them and the surrounding areas, so as to readjust the denoising process. The spatial importance weights and noise levels of all other areas remain unchanged. The execution intensity of denoising and the injection intensity of noise re-injection are both controlled by the spatiotemporal importance combination function defined during the noise addition process, which ensures the precision of denoising and re-injection. In this way, after each iteration the user has the opportunity to intervene in and adjust the current iteration, enabling effective participation in and guidance of the image generation process while maintaining its precision and stability.

[0045] The denoising process is explained in detail below. Assume a current iteration step and the target noisy image at the start of that step. First, the diffusion model is invoked and, based on the preset image denoising logic combined with the predicted Gaussian noise, computes the denoised original target image for that iteration step. The denoising intensity applied to the target noisy image is controlled by the spatiotemporal importance combination function: pixels of the target noisy image whose combination function values are closer to 1 (i.e., more "important" regions) undergo weaker denoising to preserve room for image adjustment, while pixels whose combination function values are closer to 0 (i.e., relatively unimportant regions) undergo stronger denoising so that they sharpen earlier.

[0046] Next, the denoised original target image for the iteration step is presented to the user through a visual interface (for example, zoomed in or as a partial view), making it easier for the user to judge which areas of the original target image need intervention and modification, and providing a data basis for further interactive input.

[0047] Furthermore, in some implementations, referring to Figure 5, Step 032 specifically includes: Step 0321: In response to the second input operation, reset the importance rule parameters corresponding to the region to be corrected in the original target image, so as to update the spatial importance matrix; Step 0322: Back off the iteration step for the region to be corrected to determine the response iteration step; Step 0323: Based on the response iteration step, the updated spatial importance matrix, and the temporal decay function, re-add noise to the original target image to determine the target noisy image for the next iteration step, thereby generating the target image.

[0048] Specifically, when the user marks, through interactive input (corresponding to the second input operation), a region of the original target image that requires intervention, the spatial importance weight of every pixel in the region to be modified is first reset to 1, while the spatial importance weight of each pixel in the transition region at the edge of the region to be modified is reset to a value in the range of 0.3 to 0.5, thereby updating the overall spatial importance matrix of the original target image. The aim is to increase the importance weight of the region to be modified so that the subsequent re-noising and re-denoising focus on this region, which safeguards their optimization effect in each iteration and avoids noise processing errors. Adjusting the importance weight of the transition region primarily aims to avoid a "seam" between the region to be modified and its surrounding areas: after resetting, the higher the importance weight of pixels in the transition region, the more natural the transition between the region to be modified and its surroundings. As the above implementation shows, when the spatial importance matrix of the original target image is updated, the spatiotemporal importance combination function is updated at the same time.
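The reset rule in this paragraph can be sketched as follows for a rectangular region to be modified. The function name, the rectangular shape, the ring width, and the default transition weight of 0.4 are illustrative assumptions; the 0.3 to 0.5 bound on the transition weight comes from the text.

```python
def apply_correction(matrix, region, ring=1, ring_weight=0.4):
    """Return a copy of matrix with the marked region reset to 1 and a
    surrounding transition ring reset to ring_weight; other pixels are unchanged.

    region is (top, left, bottom, right) with bottom/right exclusive (hypothetical layout).
    """
    if not 0.3 <= ring_weight <= 0.5:
        raise ValueError("transition weight should lie in [0.3, 0.5]")
    top, left, bottom, right = region
    updated = [row[:] for row in matrix]  # leave the caller's matrix untouched
    for i in range(max(0, top - ring), min(len(matrix), bottom + ring)):
        for j in range(max(0, left - ring), min(len(matrix[0]), right + ring)):
            inside = top <= i < bottom and left <= j < right
            updated[i][j] = 1.0 if inside else ring_weight
    return updated
```

Because the spatiotemporal importance combination function is computed from this matrix, recomputing it after `apply_correction` reflects the simultaneous update described at the end of the paragraph.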

[0049] Next, noise is re-added to the original target image. First, new iteration step information is set based on the iteration step back-off amount determined during initialization of the diffusion model, as shown in Formula 8.

[0050] ... Formula 8, where the back-off term is the iteration step back-off amount mentioned above. The back-off amount is set to a bounded value to avoid backing off too many iteration steps, which would reduce the efficiency of image processing. For example, when the upper limit of the iteration steps is in the range of 50 to 100, the back-off amount can be set to a number in the range of 5 to 15.

[0051] In conjunction with the aforementioned iterative step backoff, the spatiotemporal importance combination function is updated based on the update of the spatial importance matrix. The new iteration step information Substitute them in, and then combine the functions according to the updated spatiotemporal importance. For the original target image Perform noise injection; the injection intensity of the noise is... , To be related to the iteration step The corresponding noise variance is used to obtain the original target image after re-injecting noise. ,like A value of 0 indicates the end of the noise reduction process, and the original target image is returned to its original state. The final target image can be output as the target image. If it is not 0, then start from the iteration step. Beginning, with The noise reduction process continues iteratively on the target image.

[0052] Specifically, if the user determines that the original target image contains no area to be modified, the original target image is used as the target noisy image for the next iteration step, and the denoising process is repeated iteratively. Whether to stop is decided by whether the iteration step is 0: if it is 0, the denoising process ends and the most recently obtained original target image is used as the final target image; if it is not 0, the denoising process is repeated until the final target image is obtained. At this point, the image generation method is complete.
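The control flow of paragraphs [0048] to [0052] can be summarized in a short sketch. The three callbacks (`denoise_step`, `get_region`, `renoise_from`) are hypothetical placeholders for the diffusion model's denoising step, the user's second input operation, and the backoff-plus-re-noising procedure. The `max_interventions` cap is an added assumption that guarantees termination; it does not appear in this application.

```python
from typing import Callable, Optional, Tuple, TypeVar

Image = TypeVar("Image")
Region = TypeVar("Region")

def interactive_denoise(
    image: Image,
    start_step: int,
    denoise_step: Callable[[Image, int], Image],
    get_region: Callable[[Image, int], Optional[Region]],
    renoise_from: Callable[[Image, Region, int], Tuple[Image, int]],
    max_interventions: int = 10,
) -> Image:
    """Denoise until step 0, letting the user intervene after each step."""
    step = start_step
    interventions = 0
    while step > 0:
        image = denoise_step(image, step)   # one denoising iteration
        step -= 1
        region = None
        if interventions < max_interventions:
            region = get_region(image, step)  # None means "nothing to modify"
        if region is not None:
            # Back off and re-inject noise; returns the image and resumed step.
            image, step = renoise_from(image, region, step)
            interventions += 1
    return image

# Usage with toy callbacks: the "image" is a number and each step subtracts 1.
calls = []
asked = []

def toy_denoise(x, t):
    calls.append(t)
    return x - 1.0

def toy_region(x, t):
    # Intervene exactly once, the first time step 2 is reached.
    if t == 2 and not asked:
        asked.append(t)
        return "marked-region"
    return None

def toy_renoise(x, region, t):
    return x + 2.0, t + 2                   # back off two steps

result = interactive_denoise(5.0, 5, toy_denoise, toy_region, toy_renoise)
```

Keeping the model, the user interface, and the re-noising rule behind callbacks keeps the loop independent of any particular diffusion library.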

[0053] The image generation method provided in this application offers a spatiotemporal importance-aware noise scheduling mechanism. This enables the diffusion model to apply differentiated noise addition strategies according to the user-specified importance distribution across spatial locations and across the temporal dimension, thereby moving from "uniform noise" to "adaptive noise" and ensuring the versatility of the overall image generation process. Furthermore, because the user can intervene at each iteration step, the method allows users to preview, interrupt, and adjust the denoising process in real time, enabling effective user participation in and guidance of the image generation process.

[0054] The electronic device in this application includes a memory and a processor. The memory stores a computer program, and when the computer program is executed by the processor, the image generation method described above is implemented.

[0055] The computer-readable storage medium in the embodiments of this application stores a computer program that, when executed by one or more processors, implements the image generation method described above.

[0056] The above description is merely a preferred embodiment of this application and is not intended to limit this application in any way. Any person skilled in the art can use the technical content disclosed above to make modifications or alterations that yield equivalent embodiments without departing from the scope of the technical solution of this application. Any simple modification, equivalent substitution, or improvement made to the above embodiments based on the technical essence of this application, within its spirit and principles and without departing from its technical solution, shall still fall within the protection scope of the technical solution of this application.

Claims

1. An image generation method, characterized in that the method is based on a pre-trained diffusion model and includes: in response to a first input operation, determining a spatial importance matrix and a temporal decay function based on an input image generation task and a hyperparameter set of the diffusion model; based on the spatial importance matrix, the temporal decay function, and a noise variance, determining a closed-form solution of the noisy image relative to an initial image in the current iteration step and a weighted loss function of the diffusion model, so as to determine a target noisy image for a target iteration step, wherein the initial image is associated with the image generation task; and based on the target iteration step and the target noisy image, iteratively generating a target image according to a preset denoising logic.

2. The method according to claim 1, characterized in that, in response to the first input operation, determining the spatial importance matrix and the temporal decay function based on the input image generation task and the hyperparameter set of the diffusion model includes: in response to the first input operation, determining importance rule parameters for each region in the initial image according to the image generation task, so as to determine the spatial importance matrix, wherein the dimension of the spatial importance matrix is [missing information], defined by the height of the initial image in pixels and the width of the initial image in pixels; and determining the temporal decay function based on the hyperparameter set of the diffusion model and preset constraints, wherein the temporal decay function is monotonically decreasing.

3. The method according to claim 2, characterized in that determining, based on the spatial importance matrix, the temporal decay function, and the noise variance, the closed-form solution of the noisy image relative to the initial image in the current iteration step and the weighted loss function of the diffusion model, so as to determine the target noisy image for the target iteration step, includes: determining a spatiotemporal importance combination function based on the spatial importance matrix and the temporal decay function; determining a spatiotemporal adaptive model based on the spatiotemporal importance combination function and the noise variance; and determining, based on the spatiotemporal adaptive model, the closed-form solution and the weighted loss function, so as to determine the target noisy image for the target iteration step.

4. The method according to claim 3, characterized in that the temporal decay function is formulated as follows: [missing information], where the symbols denote, respectively: the temporal decay function; the ordinal number of the current iteration step; the initial value of the temporal decay function, taken at a specified value of the iteration step [missing information]; the upper limit of the iteration steps; and the hyperparameter set; the hyperparameter set is formulated as follows: [missing information], where the symbols denote, respectively: the pixel abscissa; the pixel ordinate; the index that replaces the iteration step ordinal as the subscript of the product; the noise variance of any iteration step from 1 to [missing information]; and the spatiotemporal importance combination function; and the spatiotemporal importance combination function is formulated as follows: [missing information], where the symbols denote the spatial importance matrix and the temporal decay function.

5. The method according to claim 4, characterized in that the spatiotemporal adaptive model is defined by the following formula: [missing information], where the symbols denote, respectively: the spatiotemporal adaptive model; the noise variance at the given iteration step; the processed image at the given iteration step, obtained by processing the initial image through one or more iteration steps; the identity matrix; and the normal distribution.

6. The method according to claim 5, characterized in that, when the initial image is determined, the closed-form solution of the noisy image relative to the initial image in the current iteration step is given by the following formula: [missing information], where the symbols denote, respectively: the initial image; standard Gaussian noise; and element-wise multiplication; and the weighted loss function is formulated as follows: [missing information], where the symbols denote, respectively: the weighted loss function; the joint expectation over three random variables, namely the iteration step, the initial image, and the actually added standard Gaussian noise; the predicted Gaussian noise; and a stability constant used to ensure numerical stability.

7. The method according to any one of claims 1-6, characterized in that the step of iteratively generating the target image based on the target iteration step and the target noisy image according to the preset denoising logic includes: based on the target iteration step and the target noisy image, generating the original target image for the current iteration step using the preset denoising logic; and in response to a second input operation, updating the spatial importance matrix based on the original target image and an input region to be corrected, so as to generate the target image.

8. The method according to claim 7, characterized in that the step of, in response to the second input operation, updating the spatial importance matrix based on the original target image and the input region to be corrected so as to generate the target image includes: in response to the second input operation, resetting, according to the region to be corrected, the importance rule parameters corresponding to that region in the original target image, so as to update the spatial importance matrix; performing an iteration step backoff for the region to be corrected to determine a response iteration step; and based on the response iteration step, the updated spatial importance matrix, and the temporal decay function, re-noising the original target image to determine the target noisy image for the next iteration step, thereby generating the target image.

9. An electronic device, characterized in that the electronic device includes a memory and a processor, the memory storing a computer program that, when executed by the processor, implements the image generation method as described in any one of claims 1-8.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by one or more processors, implements the image generation method as described in any one of claims 1-8.