Image data enhancement method and device based on adding noise to image

By using the diffusion and de-diffusion processes of the image diffusion model to add and remove noise from the image dataset, the problem of lack of realism in existing image data enhancement technologies is solved, and image datasets that are more in line with real-world scenarios are generated.

CN116385328B · Active · Publication Date: 2026-06-16 · BEIJING LONGZHI DIGITAL TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: BEIJING LONGZHI DIGITAL TECH CO LTD
Filing Date: 2023-04-07
Publication Date: 2026-06-16

Figure CN116385328B_ABST

Abstract

The present disclosure relates to the technical field of image processing, and provides an image data enhancement method and device based on adding noise to an image. The method comprises: obtaining an image data set to be data enhanced; adding noise to a target image in the image data set continuously multiple times by using a diffusion process of an image diffusion model, to obtain a first noise image corresponding to the target image; predicting multiple noises added in the diffusion process by using an inverse diffusion process of the image diffusion model, and sequentially removing the predicted multiple noises from the first noise image to obtain a first denoised image corresponding to the target image after restoration; and generating an image data set after data enhancement by using the target image and the first denoised image. By using the above technical means, the problem that an image obtained by using a traditional data enhancement method lacks authenticity is solved.

Description

Technical Field

[0001] This disclosure relates to the field of image processing technology, and in particular to an image data enhancement method and apparatus based on adding noise to an image.

Background Technology

[0002] In computer vision, image data augmentation is a common method used to enrich training datasets and improve the generalization ability of models. Existing image data augmentation methods typically generate new image data by performing a series of affine transformations on the original image. Common affine transformations include random rotation, flipping, and cropping. For example, existing image data augmentation methods randomly select a region from the original image, crop it, randomly rotate, slightly stretch, or flip the cropped image, and then add the transformed image to the training dataset. The main drawback of this approach is its lack of realism. Random transformations cannot accurately reproduce how images change in real-world applications, so they cannot effectively simulate visual and environmental variation such as changes in lighting and perspective. The generated images therefore often lack realism, in the sense that the data-augmented images do not conform to the changes images undergo in real-world application scenarios.

[0003] In realizing the concept disclosed herein, the inventors found at least the following technical problem in the related technologies: images obtained by traditional data augmentation methods lack realism.

Summary of the Invention

[0004] In view of this, embodiments of the present disclosure provide an image data enhancement method, apparatus, electronic device, and computer-readable storage medium based on adding noise to an image, to solve the problem that images obtained by traditional data enhancement methods lack realism in the prior art.

[0005] A first aspect of this disclosure provides an image data augmentation method based on adding noise to an image, comprising: acquiring an image dataset to be augmented; continuously adding noise to a target image in the image dataset multiple times using the diffusion process of an image diffusion model to obtain a first noisy image corresponding to the target image; predicting multiple noises added during the diffusion process using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted multiple noises from the first noisy image to obtain a restored first denoised image corresponding to the target image; and generating an augmented image dataset using the target image and the first denoised image.

[0006] A second aspect of this disclosure provides an image data enhancement apparatus based on adding noise to an image, comprising: an acquisition module configured to acquire an image dataset to be enhanced; a diffusion module configured to continuously add noise multiple times to a target image in the image dataset using a diffusion process of an image diffusion model to obtain a first noise image corresponding to the target image; an inverse diffusion module configured to predict multiple noises added during the diffusion process using an inverse diffusion process of an image diffusion model, and sequentially remove the predicted multiple noises from the first noise image to obtain a restored first denoised image corresponding to the target image; and an enhancement module configured to generate an enhanced image dataset using the target image and the first denoised image.

[0007] A third aspect of this disclosure provides an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the method described above.

[0008] A fourth aspect of this disclosure provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the above-described method.

[0009] Compared with the prior art, the embodiments of this disclosure have the following beneficial effects. An image dataset to be augmented is obtained; the diffusion process of an image diffusion model is used to continuously add noise to a target image in the image dataset multiple times to obtain a first noisy image corresponding to the target image; the inverse diffusion process of the image diffusion model is used to predict the multiple noises added during the diffusion process, and the predicted noises are sequentially removed from the first noisy image to obtain a restored first denoised image corresponding to the target image; and the target image and the first denoised image are used to generate a data-augmented image dataset. With these technical means, the problem that images obtained by traditional data augmentation methods in the prior art lack realism can be solved, so that the data-augmented images conform to the changes images undergo in actual application scenarios.

Attached Figure Description

[0010] To more clearly illustrate the technical solutions in the embodiments of this disclosure, the accompanying drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this disclosure. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0011] Figure 1 is a schematic diagram illustrating an application scenario of an embodiment of this disclosure;

[0012] Figure 2 is a schematic flowchart of an image data enhancement method based on adding noise to an image, provided in an embodiment of this disclosure;

[0013] Figure 3 is a schematic diagram of the structure of an image data enhancement device based on adding noise to an image, provided in an embodiment of this disclosure;

[0014] Figure 4 is a schematic diagram of the structure of an electronic device provided in an embodiment of this disclosure.

Detailed Implementation

[0015] In the following description, specific details such as particular system architectures and techniques are set forth for illustrative purposes and not for limitation, so as to provide a thorough understanding of the embodiments of this disclosure. However, those skilled in the art will understand that this disclosure may also be implemented in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, apparatuses, circuits, and methods have been omitted so as not to obscure the description of this disclosure with unnecessary detail.

[0016] A method and apparatus for image data enhancement based on adding noise to an image, according to embodiments of the present disclosure, will now be described in detail with reference to the accompanying drawings.

[0017] Figure 1 is a schematic diagram illustrating an application scenario of an embodiment of this disclosure. The application scenario may include terminal devices 101, 102, and 103, server 104, and network 105.

[0018] Terminal devices 101, 102, and 103 can be hardware or software. When terminal devices 101, 102, and 103 are hardware, they can be various electronic devices with displays that support communication with server 104, including but not limited to smartphones, tablets, laptops, and desktop computers. When terminal devices 101, 102, and 103 are software, they can be installed in the aforementioned electronic devices. Terminal devices 101, 102, and 103 can be implemented as multiple software programs or software modules, or as a single software program or software module; this disclosure does not impose any limitations on this. Furthermore, various applications can be installed on terminal devices 101, 102, and 103, such as data processing applications, instant messaging tools, social platform software, search applications, shopping applications, etc.

[0019] Server 104 can be a server that provides various services, such as a backend server that receives requests sent by terminal devices with which it has established communication connections. This backend server can receive and analyze the requests sent by the terminal devices and generate processing results. Server 104 can be a single server, a server cluster consisting of several servers, or a cloud computing service center. This embodiment of the disclosure does not impose any limitations on these aspects.

[0020] It should be noted that server 104 can be either hardware or software. When server 104 is hardware, it can be various electronic devices that provide various services to terminal devices 101, 102, and 103. When server 104 is software, it can be multiple software programs or software modules that provide various services to terminal devices 101, 102, and 103, or it can be a single software program or software module that provides various services to terminal devices 101, 102, and 103. This disclosure does not limit the scope of the embodiments.

[0021] Network 105 can be a wired network using coaxial cable, twisted pair, and fiber optic connection, or it can be a wireless network that enables interconnection of various communication devices without wiring, such as Bluetooth, Near Field Communication (NFC), Infrared, etc. This disclosure does not limit the scope of the network.

[0022] Users can establish a communication connection with server 104 via network 105 through terminal devices 101, 102, and 103 to receive or send information, etc. It should be noted that the specific types, quantities, and combinations of terminal devices 101, 102, and 103, server 104, and network 105 can be adjusted according to the actual needs of the application scenario, and this disclosure embodiment does not impose any limitations on this.

[0023] Figure 2 is a schematic flowchart of an image data enhancement method based on adding noise to an image, provided by an embodiment of this disclosure. The method of Figure 2 can be executed by the computer or server of Figure 1, or by software on the computer or server. As shown in Figure 2, this image data augmentation method based on adding noise to an image includes:

[0024] S201, Obtain the image dataset to be augmented;

[0025] S202, using the diffusion process of the image diffusion model, noise is added to the target image in the image dataset multiple times to obtain the first noise image corresponding to the target image;

[0026] S203, use the inverse diffusion process of the image diffusion model to predict multiple noises added during the diffusion process, and remove the predicted multiple noises in the first noise image in sequence to obtain the restored first denoised image corresponding to the target image;

[0027] S204, using the target image and the first denoised image to generate a data-enhanced image dataset.

[0028] The image diffusion model has two processes: diffusion and inverse diffusion. Diffusion adds computed noise to the target image in the image dataset multiple times to obtain a first noisy image corresponding to the target image. Inverse diffusion predicts the multiple noises added during diffusion and sequentially removes these predicted noises from the first noisy image to obtain a restored first denoised image.

[0029] The image dataset contains multiple target images. For ease of understanding, first consider a single target image. After one diffusion process, the first noisy image corresponding to the target image is obtained. After one inverse diffusion process, the first denoised image corresponding to the target image is obtained. Applying the same procedure to every target image yields the first denoised image corresponding to each of them. All the target images and their corresponding first denoised images then constitute the data-augmented image dataset.
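
The per-image flow described above can be sketched as follows. This is a minimal Python sketch, not the disclosed implementation: `augment_dataset`, `toy_diffuse`, and `toy_denoise` are hypothetical names, and the two toy functions are placeholders standing in for the diffusion and inverse diffusion processes of a trained image diffusion model.

```python
import numpy as np

def augment_dataset(images, diffuse, denoise):
    """Return the original images followed by one restored image per original."""
    augmented = list(images)              # S201: the dataset to be augmented
    for target in images:
        noisy = diffuse(target)           # S202: first noisy image
        restored = denoise(noisy)         # S203: first denoised image
        augmented.append(restored)        # S204: originals plus restored images
    return augmented

# Placeholder stand-ins (assumptions, not a trained diffusion model).
_rng = np.random.default_rng(0)

def toy_diffuse(image):
    # Adds Gaussian noise once; a real model would add noise over many steps.
    return image + 0.1 * _rng.standard_normal(image.shape)

def toy_denoise(noisy):
    # Clips to [0, 1]; a real model would predict and remove the added noise.
    return np.clip(noisy, 0.0, 1.0)

dataset = [_rng.random((4, 4)) for _ in range(3)]
augmented = augment_dataset(dataset, toy_diffuse, toy_denoise)
print(len(dataset), len(augmented))
```

Passing the two processes in as callables keeps the dataset-level logic independent of how the model itself is implemented.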

[0030] The diffusion model is primarily a denoising model in terms of structure. It can be a U-Net structure composed of multiple convolutional and deconvolutional layers. The input and output shapes of the U-Net are identical, used to predict noise at each step. This embodiment utilizes a trained image diffusion model for image data augmentation.

[0031] According to the technical solution provided in this disclosure, an image dataset to be augmented is obtained; noise is continuously added to the target image in the image dataset multiple times using the diffusion process of an image diffusion model to obtain a first noisy image corresponding to the target image; multiple noises added during the diffusion process are predicted using the inverse diffusion process of the image diffusion model, and the predicted noises are sequentially removed from the first noisy image to obtain a restored first denoised image corresponding to the target image; the target image and the first denoised image are used to generate an augmented image dataset. Therefore, by adopting the above technical means, the problem of lack of realism in images obtained by traditional data augmentation methods in the prior art can be solved, thereby making the data-augmented image conform to the changes in images in actual application scenarios.

[0032] Using the diffusion process of the image diffusion model to add noise to the target image in the image dataset multiple times, to obtain the first noisy image corresponding to the target image, includes calculating the target image after each noise addition as follows: based on the target image after the previous noise addition and the base noise obtained from the previous sampling, the target image after the current noise addition is calculated with the noise calculation formula. The base noise obtained from the previous sampling is sampled from Gaussian noise at the previous noise addition. Gaussian noise is noise that follows a Gaussian distribution.

[0033] The noise calculation formula is:

[0034]
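
The formula itself does not survive in this text. As an assumption only, consistent with the symbols used in this description (the noised image at steps t and t-1, the constant β with a value between 0 and 1, and the Gaussian base noise) but not confirmed by it, a single noise addition in the standard denoising diffusion (DDPM) forward process would read:

```latex
x_t = \sqrt{1-\beta_t}\, x_{t-1} + \sqrt{\beta_t}\, \epsilon_{t-1},
\qquad \epsilon_{t-1} \sim \mathcal{N}(0, I), \quad t = 1, \dots, N
```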

[0035] Here x_t is the target image after the t-th addition of noise, x_{t-1} is the target image after the (t-1)-th addition of noise, β_t is a constant whose value ranges from 0 to 1, and ε_{t-1} is the base noise obtained from the (t-1)-th sampling. When t equals 1, x_0 is the target image. When the number of times noise is added is N, x_N is the first noisy image corresponding to the target image.

[0036] x_N is the target image after the N-th addition of noise, which is the first noisy image corresponding to the target image.
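
The repeated noise addition can be sketched in Python with NumPy. The update rule used here, x_t = sqrt(1 - β_t)·x_{t-1} + sqrt(β_t)·ε, is the standard DDPM forward step and is an assumption, since the formula is not reproduced in this text; `forward_diffusion` and the linear β schedule are likewise hypothetical choices for illustration.

```python
import numpy as np

def forward_diffusion(x0, betas, rng):
    """Add noise len(betas) times; return x_N and the base noise of each step."""
    x = np.asarray(x0, dtype=np.float64)
    noises = []
    for beta_t in betas:
        eps = rng.standard_normal(x.shape)   # base noise sampled from N(0, I)
        # Assumed update: scale the previous image and mix in the new noise.
        x = np.sqrt(1.0 - beta_t) * x + np.sqrt(beta_t) * eps
        noises.append(eps)
    return x, noises

rng = np.random.default_rng(0)
x0 = rng.random((8, 8))                      # stand-in for a target image
betas = np.linspace(1e-4, 0.02, 50)          # assumed schedule, each value in (0, 1)
x_n, noises = forward_diffusion(x0, betas, rng)
```

Returning the sampled noises alongside x_N is a design choice for the sketch: training needs them to compare against the noises predicted during inverse diffusion.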

[0037] Inverse diffusion is the reverse of diffusion. A trained image diffusion model predicts multiple noises added during diffusion and then sequentially removes these predicted noises from the first noisy image, ultimately obtaining the first denoised image corresponding to the target image. Through training, the image diffusion model learns and stores the correspondence between the noise added during diffusion and the noise predicted during inverse diffusion.
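
Sequential noise removal can be illustrated with a short Python sketch. It assumes the forward step x_t = sqrt(1 - β_t)·x_{t-1} + sqrt(β_t)·ε and inverts it algebraically; this is not the stochastic sampler used by typical DDPM implementations, and `reverse_diffusion` and `predict_noise` are hypothetical names. In the demo, an oracle that returns the exact added noise stands in for a trained model.

```python
import numpy as np

def reverse_diffusion(x_n, betas, predict_noise):
    """Remove the predicted noise of steps N, N-1, ..., 1 in turn."""
    x = np.asarray(x_n, dtype=np.float64)
    for t in range(len(betas), 0, -1):
        beta_t = betas[t - 1]
        eps_hat = predict_noise(x, t)        # predicted noise added at step t
        # Algebraic inverse of the assumed forward step.
        x = (x - np.sqrt(beta_t) * eps_hat) / np.sqrt(1.0 - beta_t)
    return x

# Demo: noise an image, then undo it with an oracle predictor.
rng = np.random.default_rng(0)
x0 = rng.random((8, 8))
betas = np.linspace(1e-4, 0.02, 20)          # assumed schedule
x, added = x0.copy(), []
for beta_t in betas:
    eps = rng.standard_normal(x0.shape)
    x = np.sqrt(1.0 - beta_t) * x + np.sqrt(beta_t) * eps
    added.append(eps)

restored = reverse_diffusion(x, betas, lambda x_t, t: added[t - 1])
```

With perfect predictions the original image is recovered exactly (up to floating-point error); a trained model predicts the noise only approximately, which is why the restored image differs from the target image.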

[0038] In one alternative embodiment, the number of first denoised images corresponding to the target image is controlled by controlling the number of times the image diffusion model is used to process the target image, thereby controlling the size of the data-enhanced image dataset.

[0039] Each time the image diffusion model is used to process the target image, a first denoised image corresponding to the target image is obtained. When the image diffusion model is used to process the target image multiple times, a large number of first denoised images corresponding to the target image are obtained (because the basic noise obtained by sampling is random, the first denoised image obtained each time the image diffusion model is used to process the target image is different, so the image diffusion model can be used to achieve image data enhancement).
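
The relationship between the number of passes and the dataset size can be sketched as follows. This is an illustrative Python sketch under stated assumptions: `augment_with_passes` and `toy_process` are hypothetical names, and `toy_process` is a placeholder for one full pass of diffusion followed by inverse diffusion.

```python
import numpy as np

def augment_with_passes(images, process_once, passes):
    """Process every target image `passes` times and keep all results."""
    augmented = list(images)
    for target in images:
        for _ in range(passes):
            augmented.append(process_once(target))
    return augmented

rng = np.random.default_rng(0)

def toy_process(image):
    # Placeholder for one model pass; the fresh random draw mimics the
    # randomly sampled base noise, so repeated passes give different outputs.
    return image + 0.05 * rng.standard_normal(image.shape)

images = [rng.random((4, 4)) for _ in range(5)]
augmented = augment_with_passes(images, toy_process, passes=3)
```

With M target images and K passes per image, the augmented dataset holds M × (1 + K) images, so K directly sets the dataset size.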

[0040] Before adding noise multiple times to a target image in an image dataset using the diffusion process of an image diffusion model to obtain a first noisy image corresponding to the target image, the method further includes: acquiring a training dataset; adding noise multiple times to a training image in the training dataset using the diffusion process of an image diffusion model to obtain a second noisy image corresponding to the training image; predicting multiple noises added during the diffusion process using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted multiple noises from the second noisy image to obtain a restored second denoised image corresponding to the training image; calculating the total loss between the multiple noises added during the diffusion process and the multiple noises predicted during the inverse diffusion process; and updating the model parameters of the image diffusion model based on the total loss to complete the training of the image diffusion model.

[0041] The training image, the second noisy image, and the second denoised image play the same roles during training as the target image, the first noisy image, and the first denoised image do during use. The different names serve only to distinguish the model training process from the model usage process.

[0042] The loss between the multiple noises added during diffusion and the multiple noises predicted during inverse diffusion can be calculated from the similarity between each pair of corresponding noises (e.g., the first noise added corresponds to the prediction of the first noise added, and the second noise added corresponds to the prediction of the second noise added).

[0043] Before adding noise to the training images in the training dataset multiple times using the diffusion process of the image diffusion model to obtain the second noise image corresponding to the training image, the method further includes: calculating the training image after each noise addition in the following way: based on the training image after the previous noise addition and the basic noise obtained from the previous sampling, the training image after the current noise addition is calculated using the noise calculation formula, wherein the basic noise obtained from the previous sampling is sampled from Gaussian noise when adding noise in the previous time, and Gaussian noise is noise that satisfies a Gaussian distribution, and the training image after multiple noise additions is the second noise image corresponding to the training image.

[0044] The calculation of the total loss between multiple noises added during diffusion and multiple noises predicted during reverse diffusion includes: calculating the loss between each noise added during diffusion and the corresponding noise predicted during reverse diffusion; and summing all the calculated losses as the total loss.

[0045] Optionally, the loss function can be the mean squared error.
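
The total loss described above, with mean squared error as the per-step loss, can be sketched in Python. `total_noise_loss` is a hypothetical name, and the arrays in the demo are made-up values chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def total_noise_loss(added, predicted):
    """Sum the per-step mean squared error between added and predicted noises."""
    if len(added) != len(predicted):
        raise ValueError("each added noise needs one predicted noise")
    # Pair the t-th added noise with the t-th predicted noise.
    return sum(float(np.mean((a - p) ** 2)) for a, p in zip(added, predicted))

added = [np.zeros((2, 2)), np.ones((2, 2))]
predicted = [np.ones((2, 2)), np.ones((2, 2))]
loss = total_noise_loss(added, predicted)    # step 1 contributes 1.0, step 2 contributes 0.0
```

In training, this scalar would be the quantity minimized when updating the model parameters.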

[0046] Optionally, image data enhancement can be performed through the following steps:

[0047] Obtain a training dataset; use the diffusion process of an image diffusion model to continuously add noise to the training images in the training dataset multiple times to obtain a second noisy image corresponding to each training image; use the inverse diffusion process of the image diffusion model to predict the multiple noises added during the diffusion process, and sequentially remove the predicted noises from the second noisy image to obtain a restored second denoised image corresponding to the training image; calculate the total loss between the multiple noises added during the diffusion process and the multiple noises predicted during the inverse diffusion process; and update the model parameters of the image diffusion model based on the total loss to complete the training of the image diffusion model. Then obtain an image dataset to be augmented, and use the diffusion process of the image diffusion model to continuously add noise to the target image in the image dataset multiple times to obtain a first noisy image corresponding to the target image. The target image after each noise addition is calculated as follows: based on the target image after the previous noise addition and the base noise obtained from the previous sampling, the target image after the current noise addition is calculated using the noise calculation formula. The base noise obtained from the previous sampling is sampled from Gaussian noise at the previous noise addition, and Gaussian noise is noise that follows a Gaussian distribution. The target image after multiple noise additions is the first noisy image. Use the inverse diffusion process of the image diffusion model to predict the multiple noises added during the diffusion process, and sequentially remove the predicted noises from the first noisy image to obtain the first denoised image corresponding to the target image. Use the target image and the first denoised image to generate a data-enhanced image dataset. Control the number of times the target image is processed using the image diffusion model in order to control the number of first denoised images corresponding to the target image, and thus the size of the data-enhanced image dataset.

[0048] All of the above-mentioned optional technical solutions can be combined in any way to form the optional embodiments of this application, and will not be described in detail here.

[0049] The following are embodiments of the apparatus disclosed herein, which can be used to execute embodiments of the method disclosed herein. For details not disclosed in the apparatus embodiments of this disclosure, please refer to the embodiments of the method disclosed herein.

[0050] Figure 3 is a schematic diagram of an image data enhancement device based on adding noise to an image, provided in an embodiment of this disclosure. As shown in Figure 3, the image data enhancement device based on adding noise to an image includes:

[0051] The acquisition module 301 is configured to acquire the image dataset to be augmented.

[0052] The diffusion module 302 is configured to continuously add noise to the target image in the image dataset multiple times using the diffusion process of the image diffusion model, so as to obtain the first noise image corresponding to the target image.

[0053] The inverse diffusion module 303 is configured to predict multiple noises added during the diffusion process using the inverse diffusion process of the image diffusion model, and to remove the predicted multiple noises sequentially in the first noise image to obtain the restored first denoised image corresponding to the target image.

[0054] Enhancement module 304 is configured to generate an enhanced image dataset using the target image and the first denoised image.

[0055] The image diffusion model has two processes: diffusion and inverse diffusion. Diffusion adds computed noise to the target image in the image dataset multiple times to obtain a first noisy image corresponding to the target image. Inverse diffusion predicts the multiple noises added during diffusion and sequentially removes these predicted noises from the first noisy image to obtain a restored first denoised image.

[0056] The image dataset contains multiple target images. For ease of understanding, first consider a single target image. After one diffusion process, the first noisy image corresponding to the target image is obtained. After one inverse diffusion process, the first denoised image corresponding to the target image is obtained. Applying the same procedure to every target image yields the first denoised image corresponding to each of them. All the target images and their corresponding first denoised images then constitute the data-augmented image dataset.

[0057] This disclosure describes an embodiment of image data augmentation using a trained image diffusion model.

[0058] According to the technical solution provided in this disclosure, an image dataset to be augmented is obtained; noise is continuously added to the target image in the image dataset multiple times using the diffusion process of an image diffusion model to obtain a first noisy image corresponding to the target image; multiple noises added during the diffusion process are predicted using the inverse diffusion process of the image diffusion model, and the predicted noises are sequentially removed from the first noisy image to obtain a restored first denoised image corresponding to the target image; the target image and the first denoised image are used to generate an augmented image dataset. Therefore, by adopting the above technical means, the problem of lack of realism in images obtained by traditional data augmentation methods in the prior art can be solved, thereby making the data-augmented image conform to the changes in images in actual application scenarios.

[0059] Optionally, the diffusion module 302 is also configured to calculate the target image after each noise addition in the following manner: based on the target image after the previous noise addition and the basic noise obtained from the previous sampling, the target image after the current noise addition is calculated by the noise calculation formula, wherein the basic noise obtained from the previous sampling is sampled from Gaussian noise when noise was added in the previous time, and Gaussian noise is noise that satisfies a Gaussian distribution.

[0060] The noise calculation formula is:

[0061]

[0062] Here x_t is the target image after the t-th addition of noise, x_{t-1} is the target image after the (t-1)-th addition of noise, β_t is a constant whose value ranges from 0 to 1, and ε_{t-1} is the base noise obtained from the (t-1)-th sampling. When t equals 1, x_0 is the target image. When the number of times noise is added is N, x_N is the first noisy image corresponding to the target image.

[0063] x_N is the target image after the N-th addition of noise, which is the first noisy image corresponding to the target image.

[0064] Inverse diffusion is the reverse of diffusion. A trained image diffusion model predicts multiple noises added during diffusion and then sequentially removes these predicted noises from the first noisy image, ultimately obtaining the first denoised image corresponding to the target image. Through training, the image diffusion model learns and stores the correspondence between the noise added during diffusion and the noise predicted during inverse diffusion.

[0065] Optionally, the enhancement module 304 is also configured to control the number of first denoised images corresponding to the target image by controlling the number of times the image diffusion model is used to process the target image, thereby controlling the size of the data-enhanced image dataset.

[0066] Each time the image diffusion model is used to process the target image, a first denoised image corresponding to the target image is obtained. When the image diffusion model is used to process the target image multiple times, a large number of first denoised images corresponding to the target image are obtained (because the basic noise obtained by sampling is random, the first denoised image obtained each time the image diffusion model is used to process the target image is different, so the image diffusion model can be used to achieve image data enhancement).

[0067] Optionally, the diffusion module 302 is further configured to: acquire a training dataset; continuously add noise to the training images in the training dataset multiple times using the diffusion process of the image diffusion model to obtain a second noise image corresponding to the training images; predict multiple noises added during the diffusion process using the inverse diffusion process of the image diffusion model, and sequentially remove the predicted multiple noises from the second noise image to obtain the restored second denoised image corresponding to the training image; calculate the total loss between the multiple noises added during the diffusion process and the multiple noises predicted during the inverse diffusion process; and update the model parameters of the image diffusion model based on the total loss to complete the training of the image diffusion model.

[0068] The training image is the same as the target image, the second noisy image is the same as the first noisy image, and the second denoised image is the same as the first denoised image. The difference in names is only to distinguish the model training and model usage processes.

[0069] The loss between the multiple noises added during diffusion and the multiple noises predicted during inverse diffusion can be calculated from the similarity between each pair of corresponding noises (e.g., the first noise added corresponds to the prediction of the first noise added, and the second noise added corresponds to the prediction of the second noise added).

[0070] Optionally, the diffusion module 302 is further configured to calculate the training image after each noise addition in the following manner: based on the training image after the previous noise addition and the basic noise obtained from the previous sampling, the training image after the current noise addition is calculated using a noise calculation formula, wherein the basic noise obtained from the previous sampling is sampled from Gaussian noise when noise was added in the previous time, and Gaussian noise is noise that satisfies a Gaussian distribution, and the training image after multiple noise additions is the second noise image corresponding to the training image.

[0071] Optionally, the diffusion module 302 is also configured to calculate the loss between each noise added during the diffusion process and the corresponding noise predicted during the reverse diffusion process; and to sum all the calculated losses as the total loss.

[0072] Optionally, the loss function can be the mean squared error.

[0073] It should be understood that the sequence number of each step in the above embodiments does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this disclosure.

[0074] Figure 4 is a schematic diagram of the electronic device 4 provided in an embodiment of this disclosure. As shown in Figure 4, the electronic device 4 of this embodiment includes: a processor 401, a memory 402, and a computer program 403 stored in the memory 402 and executable on the processor 401. When the processor 401 executes the computer program 403, it implements the steps in the various method embodiments described above. Alternatively, when the processor 401 executes the computer program 403, it implements the functions of each module/unit in the various device embodiments described above.

[0075] The electronic device 4 can be a desktop computer, laptop, handheld computer, cloud server, or other electronic device. The electronic device 4 may include, but is not limited to, the processor 401 and the memory 402. Those skilled in the art will understand that Figure 4 is merely an example of the electronic device 4 and does not constitute a limitation on the electronic device 4. It may include more or fewer components than shown, or different components.

[0076] The processor 401 may be a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.

[0077] The memory 402 can be an internal storage unit of the electronic device 4, such as a hard disk or RAM of the electronic device 4. The memory 402 can also be an external storage device of the electronic device 4, such as a plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, Flash Card, etc., equipped on the electronic device 4. The memory 402 can also include both internal and external storage units of the electronic device 4. The memory 402 is used to store computer programs and other programs and data required by the electronic device.

[0078] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional units and modules is merely an example. In practical applications, the above functions can be assigned to different functional units and modules as needed, that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0079] If an integrated module/unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments can also be implemented by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, it can implement the steps of the various method embodiments described above. The computer program may include computer program code, which can be in the form of source code, object code, executable files, or certain intermediate forms. A computer-readable medium may include: any entity or device capable of carrying computer program code, recording media, USB flash drives, portable hard drives, magnetic disks, optical disks, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunication signals, software distribution media, etc. It should be noted that the content included in a computer-readable medium may be expanded or reduced as appropriate according to the requirements of legislation and patent practice in a given jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, computer-readable media may not include electrical carrier signals and telecommunication signals.

[0080] The above embodiments are only used to illustrate the technical solutions of this disclosure, and are not intended to limit it. Although this disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this disclosure, and should all be included within the protection scope of this disclosure.

Claims

1. An image data enhancement method based on adding noise to an image, characterized in that the method includes:
obtaining an image dataset to be augmented;
adding noise multiple times in succession to a target image in the image dataset using the diffusion process of an image diffusion model, to obtain a first noise image corresponding to the target image;
predicting the multiple noises added during the diffusion process using the reverse diffusion process of the image diffusion model, and removing the predicted noises sequentially from the first noise image, to obtain a restored first denoised image corresponding to the target image;
generating a data-enhanced image dataset using the target image and the first denoised image;
wherein adding noise multiple times in succession to the target image in the image dataset using the diffusion process of the image diffusion model, to obtain the first noise image corresponding to the target image, includes: calculating the target image after each noise addition as follows: based on the target image after the previous noise addition and the basic noise obtained from the previous sampling, the target image after the current noise addition is calculated using the noise calculation formula, wherein the basic noise obtained from the previous sampling is the noise sampled from Gaussian noise at the previous noise addition, and the Gaussian noise is noise that follows a Gaussian distribution;
the noise calculation formula is: [formula missing in source], where [symbol missing] is the target image after the t-th noise addition; [symbol missing] is the target image after the (t-1)-th noise addition; [symbol missing] is a constant whose value ranges from 0 to 1; [symbol missing] is the basic noise obtained from the (t-1)-th sampling; when t equals 1, [text missing in source]; the target image undergoes N noise additions, and [symbol missing] is the first noise image corresponding to the target image;
the method further includes: controlling the number of times the target image is processed using the image diffusion model, thereby controlling the number of first denoised images corresponding to the target image and the size of the data-enhanced image dataset;
before adding noise multiple times in succession to the target image in the image dataset using the diffusion process of the image diffusion model to obtain the first noise image corresponding to the target image, the method further includes:
obtaining a training dataset;
adding noise multiple times in succession to a training image in the training dataset using the diffusion process of the image diffusion model, to obtain a second noise image corresponding to the training image;
predicting the multiple noises added during the diffusion process using the reverse diffusion process of the image diffusion model, and removing the predicted noises sequentially from the second noise image, to obtain a restored second denoised image corresponding to the target image;
calculating the total loss between the multiple noises added during the diffusion process and the multiple noises predicted during the reverse diffusion process; and
updating the model parameters of the image diffusion model based on the total loss, to complete the training of the image diffusion model.

2. The method according to claim 1, characterized in that, before adding noise multiple times to the training images in the training dataset using the diffusion process of the image diffusion model to obtain the second noise image corresponding to the training image, the method further includes: calculating the training image after each noise addition as follows: based on the training image after the previous noise addition and the basic noise obtained from the previous sampling, the training image after the current noise addition is calculated using the noise calculation formula; the basic noise obtained from the previous sampling is the noise sampled from Gaussian noise at the previous noise addition; the Gaussian noise is noise that follows a Gaussian distribution; and the training image after multiple noise additions is the second noise image corresponding to the training image.

3. The method according to claim 1, characterized in that calculating the total loss between the multiple noises added during the diffusion process and the multiple noises predicted during the reverse diffusion process includes: calculating the loss between each noise added during the diffusion process and the corresponding noise predicted during the reverse diffusion process; and taking the sum of all the calculated losses as the total loss.

4. An image data enhancement apparatus based on adding noise to an image, the apparatus employing the method according to any one of claims 1-3, characterized in that the apparatus includes:
an acquisition module configured to acquire the image dataset to be augmented;
a diffusion module configured to add noise multiple times in succession to the target image in the image dataset using the diffusion process of the image diffusion model, to obtain a first noise image corresponding to the target image;
an inverse diffusion module configured to predict the multiple noises added during the diffusion process using the inverse diffusion process of the image diffusion model, and to remove the predicted noises sequentially from the first noise image, to obtain the restored first denoised image corresponding to the target image; and
an enhancement module configured to generate the data-enhanced image dataset using the target image and the first denoised image.

5. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 3.

6. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 3.