A method and device for repairing a damaged document image, a terminal device and a computer readable storage medium
By training a general image restoration model and using spatial pyramids and diffusion sub-models to learn the features and noise distribution of damaged document images, the problem of difficult model maintenance in existing technologies is solved, and unified restoration of different types of damaged document images is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SUN YAT SEN UNIV
- Filing Date
- 2024-05-17
- Publication Date
- 2026-06-16
AI Technical Summary
Existing technologies require the design and training of numerous independent models for different types of damaged document images, leading to difficulties in model maintenance and making it hard to adapt to image restoration under different scenarios and conditions.
By training a general image restoration model, utilizing spatial pyramids and diffusion sub-models, the model learns the features and noise distribution of damaged document images, generates denoised restored document images, calculates the loss function, and adjusts the model parameters to achieve unified restoration of different types of damaged document images.
This approach enables the same model to be adapted to the restoration of different types of damaged document images, avoiding the difficulties of designing and training a large number of independent models, and adapting to image restoration under different scenarios and conditions.
Figure CN118396899B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of image processing, and in particular to a method for pyramid decomposition and reconstruction of images, an image denoising method, and an image restoration method.
Background Technology
[0002] In recent years, methods based on Convolutional Neural Networks (CNNs) have significantly improved the quality of image inpainting. In CNN-based image inpainting tasks, it is typically necessary to design appropriate network architectures and loss functions based on specific needs and task characteristics to achieve the desired inpainting effect. Different types of document images may have different features and problems, requiring different network architectures and parameter settings to address them. Therefore, it is necessary to design independent models for inpainting different types of damaged document images. For example, some document images may be affected by noise during scanning, while others may have blurring or distortion issues. To handle these different types of problems, the network architecture and parameters need to be appropriately adjusted. This may include adjusting the network depth, kernel size, the number and order of convolutional and pooling layers, and other hyperparameter settings.
[0003] However, images in the real world may contain complex scenes, diverse lighting conditions, and multiple objects. Therefore, designing and training multiple independent models also presents some challenges. It may be necessary to design and train a large number of independent models to cover various scenes and conditions. When faced with new scenes or conditions, it may be necessary to redesign and train new models, which increases the difficulty of model maintenance.
Summary of the Invention
[0004] This invention provides a method, apparatus, terminal device, and computer-readable storage medium for repairing damaged document images. By training a general image repair model, it enables image repair of damaged document images in different image scenarios.
[0005] An embodiment of the present invention provides a method for repairing a damaged document image, comprising: acquiring a damaged document image to be processed; inputting the damaged document image to be processed into a trained image repair model, so that the image repair model performs image repair on the damaged document image to be processed, and generating a repaired document image.
[0006] Further, the training of the image restoration model includes the following steps:
[0007] Acquire damaged document images and their corresponding lossless document images under different image scenarios, and group the damaged document images and their corresponding lossless document images into the same group;
[0008] Each group of damaged document images and their corresponding undamaged document images are input into the image restoration model to be trained for iterative training until the loss function converges, resulting in the trained image restoration model. During each training iteration, the built-in encoder encodes the damaged and undamaged document images using a spatial pyramid to obtain multi-scale features of the damaged and corresponding undamaged document images. Then, the built-in diffusion sub-model diffuses these multi-scale features to learn the features and noise distribution of the damaged document images, resulting in denoised restored document images. The denoised restored document images are then compared with the undamaged document images, and the loss function is calculated.
[0009] Furthermore, during each training iteration, the built-in encoder encodes the damaged and undamaged document images using a spatial pyramid, obtaining multi-scale features of the damaged document image and the corresponding multi-scale features of the undamaged document image. Then, the built-in diffusion sub-model diffuses these multi-scale features, learning the features and noise distribution of the damaged document image to obtain a denoised repaired document image. This denoised repaired document image is then compared with the undamaged document image, and the loss function is calculated, specifically:
[0010] The built-in encoder uses a spatial pyramid to encode a set of damaged and undamaged document images, resulting in multi-scale features of the damaged document images and the corresponding multi-scale features of the undamaged document images.
[0011] Using the diffusion sub-model built into the image restoration model to be trained, and taking the multi-scale features of the damaged document image as the target, forward diffusion processing is performed on the multi-scale features of the undamaged document image, and a noise to be applied to the multi-scale features of the damaged document image is predicted.
[0012] The predicted noise is applied to the multi-scale features of the damaged document image, and reverse diffusion processing is then performed to obtain the multi-scale features of the denoised repaired document image.
[0013] The loss function is calculated by comparing the multi-scale features of the lossless document image with the multi-scale features of the denoised and repaired document image.
[0014] If the loss function does not converge, adjust the network parameters of the current image restoration model based on the current value of the loss function, and reacquire a set of damaged document images and corresponding lossless document images.
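The comparison of multi-scale features described above can be illustrated with a minimal sketch. The patent does not name the loss, so the per-level mean-squared error used here, the function name `multiscale_mse`, and the toy arrays are all assumptions made for illustration.

```python
import numpy as np

def multiscale_mse(restored_feats, lossless_feats):
    """Sum of per-level mean-squared errors between two feature pyramids."""
    if len(restored_feats) != len(lossless_feats):
        raise ValueError("both pyramids need the same number of levels")
    return float(sum(np.mean((r - g) ** 2)
                     for r, g in zip(restored_feats, lossless_feats)))

# Toy two-level pyramids (hypothetical values, finest level first).
lossless = [np.ones((4, 4)), np.ones((2, 2))]
restored = [np.full((4, 4), 0.9), np.full((2, 2), 1.0)]

# Only the finest level differs, so only it contributes to the loss.
loss = multiscale_mse(restored, lossless)
```

A loss of this kind is zero only when every level of the repaired pyramid matches the lossless pyramid, which is the condition the iterative training drives toward.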
[0015] Furthermore, damaged document images and their corresponding lossless document images in different image scenarios include:
[0016] Damaged document images with stains and undamaged document images without stains, damaged document images with blurriness and unblurred document images, and damaged document images with watermarks and undamaged document images without watermarks.
[0017] Furthermore, spatial pyramids are used to encode damaged and lossless document images, specifically as follows:
[0018] Damaged and undamaged document images are downsampled or upsampled multiple times to obtain image sequences of different resolutions, forming a pyramid-shaped image structure.
[0019] Further, the forward diffusion processing includes:
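As a rough illustration of the pyramid-shaped image structure, the sketch below builds an image sequence by repeated 2x2 average pooling. The patent does not specify the resampling filter or the number of levels; the pooling choice, the helper name `build_pyramid`, and the 8x8 test image are assumptions.

```python
import numpy as np

def build_pyramid(image, levels=3):
    """Return a list of 2-D arrays, finest first, each half the size of the previous."""
    pyramid = [np.asarray(image, dtype=np.float64)]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        # Crop to an even size so the image splits cleanly into 2x2 blocks.
        h, w = prev.shape[0] // 2 * 2, prev.shape[1] // 2 * 2
        blocks = prev[:h, :w].reshape(h // 2, 2, w // 2, 2)
        pyramid.append(blocks.mean(axis=(1, 3)))  # 2x2 average pooling
    return pyramid

# Hypothetical 8x8 grayscale "document image".
page = np.arange(64, dtype=np.float64).reshape(8, 8)
pyr = build_pyramid(page, levels=3)
shapes = [level.shape for level in pyr]
```

Applying the same function to the damaged image and to its lossless counterpart yields two pyramids with matching resolutions at every level, so they can be compared level by level.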
[0020] Each layer of features in the lossless document image is forward diffused, and during the forward diffusion process, a denoising process is learned through PyU-Net to predict noise that is applied to the multi-scale features of the damaged document image.
[0021] Furthermore, each layer of features in the lossless document image is forward diffused, and during the forward diffusion process, a denoising process is learned through PyU-Net to predict noise applied to the multi-scale features of the damaged document image, specifically:
[0022] Receive N layers of features from the damaged document image as a feature set. The N-layer features of the lossless document image are used as the feature set.
[0023] The scale level s, the time step t of the model's image diffusion, the high-scale features of the lossless document image, the target feature mapping, and the feature set are input into the neural network that processes forward diffusion for training and learning, and the noise λ applied to the multi-scale features of the damaged document image is predicted.
[0024] Wherein, the target feature mapping is the low-scale feature after the t-th step diffusion of the low-scale features of the lossless document image; This represents the first layer of features in a corrupted document image; This represents the Nth layer feature of a corrupted document image; This represents the first layer of features in a lossless document image; This represents the Nth layer feature of a lossless document image.
[0025] Further, the reverse diffusion process includes:
[0026] For each layer of features in the original damaged document image, the noise λ obtained in the forward diffusion is added to each layer of features, and the features are mapped to obtain the features of the denoised and repaired document image.
[0027] Based on the above method embodiments, the present invention provides corresponding device embodiments, including: a damaged document image acquisition module, an image repair module, and a model training module;
[0028] The damaged document image acquisition module is connected to the image repair module.
[0029] Furthermore, the damaged document image acquisition module is used to acquire the damaged document image to be processed;
[0030] Furthermore, the image restoration module is used to input the damaged document image to be processed into the trained image restoration model, so that the image restoration model can perform image restoration on the damaged document image to be processed and generate a restored document image;
[0031] The trained image restoration model is completed through the model training module;
[0032] Furthermore, the model training module is used to perform the following:
[0033] Acquire damaged document images and their corresponding lossless document images under different image scenarios, and group the damaged document images and their corresponding lossless document images into the same group.
[0034] Each group of damaged document images and their corresponding undamaged document images are input into the image restoration model to be trained for iterative training until the loss function converges, resulting in the trained image restoration model. During each training iteration, the built-in encoder encodes the damaged and undamaged document images using a spatial pyramid to obtain multi-scale features of the damaged and corresponding undamaged document images. Then, the built-in diffusion sub-model diffuses these multi-scale features to learn the features and noise distribution of the damaged document images, resulting in denoised restored document images. The denoised restored document images are then compared with the undamaged document images, and the loss function is calculated.
[0035] Based on the above method embodiments, the present invention provides a corresponding terminal device embodiment, including: a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor. When the processor executes the computer program, it implements the steps of the method for repairing damaged document images as described in the present invention.
[0036] Based on the above method embodiments, the present invention provides a corresponding computer-readable storage medium embodiment, including: a stored computer program, which, when the computer program is running, controls the device where the computer-readable storage medium is located to perform the steps of the method for repairing damaged document images as described in the present invention.
[0037] Compared with the prior art, the beneficial effects of this embodiment are as follows:
[0038] This invention constructs a specific model incorporating spatial pyramids and Brownian bridge diffusion. Spatial pyramid processing decomposes an image into image information of different scales, allowing for a unified input for different types of document images. Diffusion processing learns about damaged document images and noise, applying the noise to the diffusion process to obtain the features of the denoised repaired document image. The loss function is calculated by comparing the undamaged document image with the denoised repaired document image, and the model is adjusted, thus completing one training iteration. Finally, through iterative training with multiple sets of damaged document images and corresponding undamaged document images, the model learns the features and noise distribution of different types of damaged document images, generating a general model capable of repairing various types of damaged document images. In other words, this invention, through a unified framework design and iterative training, allows the same model to adapt to the repair of different types of damaged document images, avoiding the need to design and train numerous independent models, and achieving the repair of damaged document images under different scenarios and conditions.
Attached Figure Description
[0039] Figure 1 is a schematic flowchart of a method for repairing damaged document images according to an embodiment of the present invention;
[0040] Figure 2 is a schematic diagram of the training process of an image restoration model provided in an embodiment of the present invention;
[0041] Figure 3 is a schematic diagram illustrating the diffusion process in one embodiment of the present invention;
[0042] Figure 4 is a schematic diagram illustrating the time steps of noise prediction applied to multi-scale features of damaged document images using the PyU-Net method;
[0043] Figure 5 is a schematic diagram of the structure of a device for repairing damaged document images according to an embodiment of the present invention;
[0044] Figure 6 is a comparison image of a damaged document image with stains and an undamaged document image without stains in the embodiment;
[0045] Figure 7 is a comparison image of a damaged document image with blurriness and a lossless document image without blurriness in the embodiment;
[0046] Figure 8 is a comparison image of a damaged document image with a watermark and a lossless document image without a watermark in the embodiment.
Detailed Implementation
[0047] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0048] Referring to Figure 1, an embodiment of the present invention provides a method for repairing damaged document images, the method comprising at least the following steps:
[0049] Step S1: Obtain the image of the damaged document to be processed;
[0050] Step S2: Input the damaged document image to be processed into the trained image inpainting model so that the image inpainting model can perform image inpainting on the damaged document image and generate the repaired document image.
[0051] Regarding step S1: In this invention, a damaged document image can be a damaged document image with stains, a damaged document image with blurriness, or a damaged document image with a watermark.
[0052] For step S2:
[0053] First, a detailed explanation of the training of the image restoration model will be provided:
[0054] As shown in Figure 2, the training of the above image restoration model includes the following steps:
[0055] Step S21: Obtain damaged document images and their corresponding lossless document images under different image scenarios, and group the damaged document images and their corresponding lossless document images into the same group.
[0056] For step S21, in a preferred embodiment, the damaged document images and their corresponding undamaged document images under different image scenarios include: damaged document images with stains and undamaged document images without stains, damaged document images with blurriness and undamaged document images without blurriness, and damaged document images with watermarks and undamaged document images without watermarks.
[0057] Specifically, as shown in Figures 6 to 8, in this invention the following can be obtained: N groups of damaged document images with stains (Figure 6(a)) and their corresponding stain-free lossless document images (Figure 6(b)); N groups of blurred damaged document images (Figure 7(a)) and their corresponding unblurred lossless document images (Figure 7(b)); and N groups of damaged document images with watermarks (Figure 8(a)) and their corresponding watermark-free lossless document images (Figure 8(b)), where N takes a value greater than 1;
[0058] This allows us to obtain damaged document images and their corresponding undamaged document images under different image scenarios, and use all the acquired images as training samples to train the image restoration model described above.
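Step S21 can be pictured as building a list of paired samples that mixes all three damage scenarios. The sketch below is only an illustration: the `ImageGroup` class, the `make_groups` helper, and every file name are hypothetical and do not come from the patent.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class ImageGroup:
    scenario: str   # "stain", "blur" or "watermark"
    damaged: str    # path to the damaged document image
    lossless: str   # path to the matching lossless document image

def make_groups(pairs_by_scenario):
    """Flatten {scenario: [(damaged, lossless), ...]} into one list of groups."""
    return [ImageGroup(scenario, damaged, lossless)
            for scenario, pairs in pairs_by_scenario.items()
            for damaged, lossless in pairs]

# Hypothetical file names; N = 2 groups per scenario.
pairs_by_scenario = {
    "stain": [("stain_0_damaged.png", "stain_0_clean.png"),
              ("stain_1_damaged.png", "stain_1_clean.png")],
    "blur": [("blur_0_damaged.png", "blur_0_clean.png"),
             ("blur_1_damaged.png", "blur_1_clean.png")],
    "watermark": [("watermark_0_damaged.png", "watermark_0_clean.png"),
                  ("watermark_1_damaged.png", "watermark_1_clean.png")],
}
groups = make_groups(pairs_by_scenario)
# Shuffle so that a single model sees all scenarios interleaved during training.
random.Random(0).shuffle(groups)
```

Keeping the damaged image and its lossless counterpart in one record ensures they always enter the model together, which is what grouping them "into the same group" requires.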
[0059] Step S22: Input each group of damaged document images and their corresponding undamaged document images into the image restoration model to be trained for iterative training until the loss function converges, and obtain the trained image restoration model; wherein, during each training, the built-in encoder encodes the damaged document images and undamaged document images using a spatial pyramid to obtain the multi-scale features of the damaged document images and the corresponding multi-scale features of the undamaged document images. Then, the built-in diffusion sub-model diffuses the multi-scale features of the damaged document images and the corresponding multi-scale features of the undamaged document images to learn the features and noise distribution of the damaged document images, and obtains the denoised restored document images. The denoised restored document images are compared with the undamaged document images, and the loss function is calculated.
[0060] For step S22, in a preferred embodiment, during each training iteration, the built-in encoder encodes the damaged document image and the lossless document image using a spatial pyramid to obtain multi-scale features of the damaged document image and the corresponding multi-scale features of the lossless document image. Then, the built-in diffusion sub-model diffuses these multi-scale features to learn the features and noise distribution of the damaged document image, resulting in a denoised repaired document image. The denoised repaired document image is then compared with the lossless document image, and a loss function is calculated, including:
[0061] The built-in encoder uses a spatial pyramid to encode a set of damaged and undamaged document images, resulting in multi-scale features of the damaged document images and the corresponding multi-scale features of the undamaged document images.
[0062] By using the diffusion sub-model built into the image restoration model to be trained, and taking the multi-scale features of the damaged document image as the target, forward diffusion processing is performed on the multi-scale features of the undamaged document image to predict a noise applied to the multi-scale features of the damaged document image.
[0063] The predicted noise is applied to the multi-scale features of the damaged document image for reverse diffusion processing, to obtain the multi-scale features of the denoised repaired document image.
[0064] The loss function is calculated by comparing the multi-scale features of the lossless document image with the multi-scale features of the denoised and repaired document image.
[0065] If the loss function does not converge, adjust the network parameters of the current image restoration model based on the current value of the loss function, and reacquire a set of damaged document images and corresponding undamaged document images.
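The iterate, compare, adjust, and reacquire cycle of step S22 can be sketched with a deliberately tiny stand-in model. Everything below is an assumption for illustration: the "model" is a single scalar `theta` subtracted from the damaged image, the damage is a constant offset of 0.3, and convergence is approximated by the loss falling below a tolerance. The real model (encoder plus diffusion sub-model) is far richer.

```python
import numpy as np

def train(groups, lr=0.25, tol=1e-8, max_iters=1000):
    """Toy training loop with one scalar parameter as the 'repair model'."""
    theta, loss = 0.0, float("inf")
    for step in range(max_iters):
        damaged, lossless = groups[step % len(groups)]      # (re)acquire one group
        restored = damaged - theta                           # stand-in for the model
        loss = float(np.mean((restored - lossless) ** 2))    # compare with lossless
        if loss < tol:                                       # simplified convergence test
            break
        grad = -2.0 * float(np.mean(restored - lossless))    # d(loss)/d(theta)
        theta -= lr * grad                                   # adjust the parameter
    return theta, loss

rng = np.random.default_rng(0)
clean_pages = [rng.random((8, 8)) for _ in range(3)]
# Hypothetical damage: every pixel is brightened by a constant 0.3.
groups = [(page + 0.3, page) for page in clean_pages]
theta, final_loss = train(groups)
```

In this toy setting the parameter settles near the injected offset, which mirrors the idea that training continues with fresh groups until the loss stops improving.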
[0066] In a preferred embodiment, encoding damaged and lossless document images using a spatial pyramid includes: downsampling or upsampling the damaged and lossless document images multiple times to obtain image sequences of different resolutions and form a pyramid-shaped image structure.
[0067] As a specific example, as shown in Figure 3, a group consisting of an unblurred lossless document image I_A and a blurred damaged document image I_B is obtained, and this group of images is fed as input into the image restoration model to be trained.
[0068] Using its built-in encoder, the model downsamples or upsamples the unblurred lossless document image I_A and the blurred damaged document image I_B multiple times through a spatial pyramid, forming a pyramid-shaped image structure of image sequences at different resolutions. Through encoding, the multi-scale features E_A of the unblurred lossless document image and the multi-scale features E_B of the blurred damaged document image are obtained;
[0069] The built-in diffusion sub-model performs diffusion processing on the multi-scale features of the blurred damaged document image and of the corresponding unblurred lossless document image. The diffusion processing includes forward diffusion and reverse diffusion: taking the multi-scale features E_B of the blurred damaged document image as the target, forward diffusion is performed on the multi-scale features E_A of the unblurred lossless document image. In this process, the diffusion sub-model predicts a noise suitable for the multi-scale features of the blurred damaged document image. The predicted noise is then applied to the multi-scale features of the blurred damaged document image for reverse diffusion. This step can be understood as diffusing the feature information of the repaired document image from low resolution to high resolution.
[0070] It should be noted that the diffusion sub-model can smooth noise and enhance the overall structure of images more quickly on lower resolution images, and more accurately preserve and enhance the details of images on higher resolution images.
[0071] By learning the features and noise distribution of the blurred damaged document image through forward diffusion within the model, and then performing reverse diffusion processing, the multi-scale features D_A of the denoised repaired document image are obtained. After image reconstruction, the denoised repaired document image is obtained; it is compared with the unblurred lossless document image, the loss function is calculated, and the parameters of the current image repair model are adjusted, which completes one training iteration of the model.
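The patent does not say how the image is reconstructed from multi-scale features. One common possibility, shown here purely as an assumption, is a Laplacian-style pyramid: each level stores the detail lost by downsampling, and reconstruction adds the details back from coarse to fine. The helper names `down`, `up`, `decompose`, and `reconstruct` are hypothetical.

```python
import numpy as np

def down(x):
    """Halve the resolution with 2x2 average pooling."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(x):
    """Double the resolution by repeating each pixel in a 2x2 block."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def decompose(image, levels):
    """Return detail layers (fine to coarse) plus the coarsest base image."""
    residuals, current = [], image
    for _ in range(levels - 1):
        coarse = down(current)
        residuals.append(current - up(coarse))  # detail lost by downsampling
        current = coarse
    return residuals, current

def reconstruct(residuals, base):
    """Rebuild the image from coarse to fine by adding the details back."""
    current = base
    for detail in reversed(residuals):
        current = up(current) + detail
    return current

# Hypothetical 16x16 image; sizes must be divisible by 2 ** (levels - 1).
image = np.random.default_rng(0).random((16, 16))
residuals, base = decompose(image, levels=3)
rebuilt = reconstruct(residuals, base)
```

With unmodified layers the reconstruction recovers the input; in a repair setting the layers would first be replaced by the denoised features.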
[0072] Then, input the next group, which may be a blurred damaged document image with its unblurred lossless document image, a stained damaged document image with its stain-free lossless document image, or a watermarked damaged document image with its watermark-free lossless document image, into the adjusted image restoration model to be trained, and repeat step S22 until the loss function converges, yielding the trained image restoration model.
[0073] In a preferred embodiment, the forward diffusion process includes: forward diffusion of each layer of features of the lossless document image, and learning a denoising process through PyU-Net during the forward diffusion process to predict a noise applied to the multi-scale features of the damaged document image.
[0074] In a preferred embodiment, each layer of features of the lossless document image is forward diffused, and during the forward diffusion process, a denoising process is learned through PyU-Net to predict noise applied to the multi-scale features of the damaged document image, including:
[0075] Receive N layers of features from the damaged document image as a feature set. The N-layer features of the lossless document image are used as the feature set.
[0076] The scale level s, the time step t of the model's image diffusion, the high-scale features of the lossless document image, the target feature mapping, and the feature set are input into the neural network that processes forward diffusion for training and learning, and the noise λ applied to the multi-scale features of the damaged document image is predicted.
[0077] Wherein, the target feature mapping is the low-scale feature after the t-th step diffusion of the low-scale features of the lossless document image; This represents the first layer of features in a corrupted document image; This represents the Nth layer feature of a corrupted document image; This represents the first layer of features in a lossless document image; This represents the Nth layer feature of a lossless document image.
[0078] In a preferred embodiment, the reverse diffusion process includes: for each layer of features of the damaged document image, adding the noise λ obtained in the forward diffusion to each layer of features, and mapping the features to obtain the features of the denoised repaired document image.
[0079] Specifically, as shown in Figure 4, this invention takes the number of feature layers N=2 as an example for illustration. First, the forward diffusion process is explained in detail:
[0080] Forward diffusion processing learns the denoising process through PyU-Net. Each layer of features of the unblurred lossless document image is taken as input, and the corresponding layer of features of the blurred damaged document image is taken as the target output. At the same time, the scale level s, the time step t of the model's image diffusion, the high-scale features of the unblurred lossless document image, the low-scale features of the unblurred lossless document image after the t-th diffusion step, and the feature set Z0 are input into the neural network that processes forward diffusion. Forward diffusion is performed at time step t to predict a noise λ that is applied to the multi-scale features of the blurred damaged document image.
[0081] Through forward diffusion, a noise λ is predicted and applied to the multi-scale features of the blurred damaged document image. For each feature layer of the blurred damaged document image, the noise λ obtained in forward diffusion is added to that feature layer. This allows the noise information to be propagated back to each feature layer. Then, feature mapping is performed on each feature layer after noise correction, and finally, the multi-scale features of the denoised repaired document image are obtained.
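The per-layer application of the predicted noise can be illustrated with the N=2 case. This is a minimal sketch under strong assumptions: the noise λ is represented as one array per feature layer, the PyU-Net prediction is replaced by the ideal correction, and the subsequent feature mapping step is omitted. The function name `apply_predicted_noise` and all values are hypothetical.

```python
import numpy as np

def apply_predicted_noise(damaged_feats, predicted_noise):
    """Add the per-layer noise estimate to each damaged feature layer."""
    return [feat + lam for feat, lam in zip(damaged_feats, predicted_noise)]

# N = 2 feature layers, finest first; values are hypothetical.
lossless_feats = [np.full((4, 4), 1.0), np.full((2, 2), 1.0)]
damaged_feats = [np.full((4, 4), 0.7), np.full((2, 2), 0.8)]

# Stand-in for the network output: the ideal correction is used directly.
# A trained network would only approximate it.
predicted_noise = [g - d for g, d in zip(lossless_feats, damaged_feats)]

restored_feats = apply_predicted_noise(damaged_feats, predicted_noise)
```

The closer the predicted noise is to the true difference between the lossless and damaged features, the closer the corrected layers come to the lossless ones, which is what the loss function measures.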
[0082] It should be noted that the model of this invention combines the spatial pyramid and the Brownian bridge diffusion model. The spatial pyramid is used to process the image at multiple scales so that the Brownian bridge diffusion model can be applied at different scales. Applying the Brownian bridge diffusion model at different scales can diffuse the damaged document image, which can effectively handle the noise in the image.
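The patent does not give the Brownian bridge schedule. As a hedged sketch, the block below uses a formulation commonly seen in the Brownian bridge diffusion literature: x_t = (1 - m_t) x_0 + m_t y + sqrt(δ_t) ε, with m_t = t/T and δ_t = 2s(m_t - m_t²). Following the description above, x_0 is taken to be the lossless features E_A and the endpoint y the damaged features E_B. The function name, the variance scale `s`, and the toy feature maps are assumptions.

```python
import numpy as np

def brownian_bridge_sample(x0, y, t, T, rng, s=1.0):
    """Sample the bridge state at step t between start x0 and endpoint y."""
    m_t = t / T
    var_t = 2.0 * s * (m_t - m_t ** 2)   # zero at both ends, largest at t = T / 2
    eps = rng.standard_normal(x0.shape)
    return (1.0 - m_t) * x0 + m_t * y + np.sqrt(var_t) * eps

rng = np.random.default_rng(0)
e_a = np.zeros((4, 4))   # hypothetical lossless features at one scale
e_b = np.ones((4, 4))    # hypothetical damaged features at the same scale

start = brownian_bridge_sample(e_a, e_b, t=0, T=10, rng=rng)
middle = brownian_bridge_sample(e_a, e_b, t=5, T=10, rng=rng)
end = brownian_bridge_sample(e_a, e_b, t=10, T=10, rng=rng)
```

Unlike a standard diffusion process that ends in pure noise, a bridge of this form is pinned at both ends, so the process runs between the two paired feature maps. Calling the same function once per pyramid level would apply it at different scales.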
[0083] As shown in Figure 5, corresponding apparatus embodiments are provided based on the above method embodiments;
[0084] One embodiment of the present invention provides a device for repairing damaged document images, including: a damaged document image acquisition module, an image repair module, and a model training module;
[0085] The damaged document image acquisition module is used to acquire the damaged document image to be processed;
[0086] The image restoration module is used to input the damaged document image to be processed into the trained image restoration model, so that the image restoration model can perform image restoration on the damaged document image to be processed and generate a restored document image;
[0087] The model training module is used to acquire damaged document images and their corresponding lossless document images under different image scenarios, and to group the damaged document images and their corresponding lossless document images into the same group.
[0088] Each group of damaged document images and their corresponding undamaged document images are input into the image restoration model to be trained for iterative training until the loss function converges, resulting in the trained image restoration model. During each training iteration, the built-in encoder encodes the damaged and undamaged document images using a spatial pyramid to obtain multi-scale features of the damaged document images and the corresponding multi-scale features of the undamaged document images. Then, the built-in diffusion sub-model diffuses the multi-scale features of the damaged and undamaged document images to learn the features and noise distribution of the damaged document images, resulting in denoised restored document images. The denoised restored document images are then compared with the undamaged document images, and the loss function is calculated.
[0089] The model training module is connected to the damaged document image acquisition module, which in turn is connected to the image restoration module. The model training module is the training phase, while the damaged document image acquisition module and the image restoration module are the usage phase.
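The split between the training phase and the usage phase can be sketched as three small classes. This is only an illustration of the module wiring: the class names, the nested-list image type, and the clamping "model" are hypothetical stand-ins, not the patented implementation.

```python
from typing import Callable, List

Image = List[List[float]]   # toy grayscale image as nested lists

class ModelTrainingModule:
    """Training phase: produces the trained repair model."""
    def train(self) -> Callable[[Image], Image]:
        # Stand-in "model" that clamps pixel values to [0, 1]; a real module
        # would run the iterative training on grouped image pairs.
        return lambda img: [[min(1.0, max(0.0, p)) for p in row] for row in img]

class AcquisitionModule:
    """Usage phase: acquires the damaged document image to be processed."""
    def acquire(self) -> Image:
        return [[0.2, 1.4], [-0.1, 0.9]]   # hypothetical damaged pixels

class ImageRepairModule:
    """Usage phase: feeds the acquired image into the trained model."""
    def __init__(self, model: Callable[[Image], Image]):
        self.model = model

    def repair(self, damaged: Image) -> Image:
        return self.model(damaged)

model = ModelTrainingModule().train()
damaged = AcquisitionModule().acquire()
repaired = ImageRepairModule(model).repair(damaged)
```

Because the repair module only depends on a callable model, the training module can be run once and its result reused for every image acquired later.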
[0090] It is understood that the above-described device embodiments correspond to the method embodiments of the present invention, and can implement the method for repairing damaged document images provided by any of the above-described method embodiments of the present invention.
[0091] It should be noted that the device embodiments described above are merely illustrative, and some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Furthermore, in the accompanying drawings of the device embodiments provided by this invention, the connection relationships between modules indicate that they have communication connections, which can be specifically implemented as one or more communication buses or signal lines. Those skilled in the art can understand and implement this without any creative effort.
[0092] Based on the above embodiments of the damaged document image repair method, another embodiment of the present invention provides a damaged document image repair terminal device, which includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor. When the processor executes the computer program, it implements the damaged document image repair method of any embodiment of the present invention.
[0093] For example, in this embodiment, the computer program can be divided into one or more modules, which are stored in the memory and executed by the processor to complete the present invention. The one or more module units may be a series of computer program instruction segments capable of performing specific functions, which describe the execution process of the computer program in the damaged document image restoration terminal device.
[0094] The damaged document image restoration terminal device can be a desktop computer, laptop, handheld computer, cloud server, or similar computing device. The damaged document image restoration terminal device may include, but is not limited to, a processor and a memory.
[0095] The processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor can be a microprocessor or any conventional processor. This processor is the control center of the damaged document image restoration terminal device, connecting all parts of the device via various interfaces and lines.
[0096] Based on the above-described method embodiments, another embodiment of the present invention provides a storage medium including a stored computer program, wherein, when the computer program runs, it controls the device where the storage medium is located to execute the method for repairing damaged document images described in any of the above-described method embodiments of the present invention.
[0097] The aforementioned storage medium is a computer-readable storage medium. The modules/units integrated into the damaged document image repair device or terminal device, if implemented as software functional units and sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the above method embodiments of the present invention can also be implemented by a computer program instructing the related hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, it can implement the steps of the various method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, an executable file, or certain intermediate forms. The computer-readable medium can include: any entity or device capable of carrying the computer program code, recording media, USB flash drives, portable hard drives, magnetic disks, optical disks, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunication signals, software distribution media, etc.
[0098] The above description represents the preferred embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principles of the present invention, and these improvements and modifications are also considered to be within the scope of protection of the present invention.
Claims
1. A method for repairing a damaged document image, characterized by comprising: acquiring a damaged document image to be processed; and inputting the damaged document image to be processed into a trained image restoration model, so that the image restoration model performs image restoration on the damaged document image to be processed and generates a restored document image; wherein the training of the image restoration model includes the following steps: acquiring damaged document images and their corresponding lossless document images under different image scenarios, and placing each damaged document image and its corresponding lossless document image in the same group; and inputting each group of damaged document images and their corresponding lossless document images into the image restoration model to be trained for iterative training until the loss function converges, thereby obtaining the trained image restoration model; wherein, during each training iteration, the built-in encoder of the image restoration model uses a spatial pyramid to encode the current group of damaged and lossless document images, obtaining multi-scale features of the damaged document image and of the corresponding lossless document image; the built-in diffusion sub-model of the image restoration model performs forward diffusion on the multi-scale features of the damaged and lossless document images to predict noise applicable to these features; the predicted noise is then applied to the multi-scale features of the damaged and lossless document images for reverse diffusion, yielding denoised restored multi-scale features; the denoised restored multi-scale features are compared with the multi-scale features of the lossless document image to calculate the loss function; and if the loss function has not converged, the network parameters of the current image restoration model are adjusted based on the current loss function value, and a new group of damaged and lossless document images is acquired.
2. The method for repairing damaged document images according to claim 1, characterized in that the damaged document images and their corresponding lossless document images under different image scenarios include: damaged document images with stains and undamaged document images without stains, damaged document images with blurriness and unblurred document images, and damaged document images with watermarks and undamaged document images without watermarks.
3. The method for repairing damaged document images according to claim 1, characterized in that encoding the damaged and lossless document images using a spatial pyramid includes: downsampling or upsampling the damaged document image and the lossless document image multiple times to obtain image sequences of different resolutions, forming a pyramid-shaped image structure.
4. The method for repairing damaged document images according to claim 1, characterized in that the forward diffusion process includes: performing forward diffusion on each layer of features of the lossless document image, and, during the forward diffusion process, learning a denoising process through PyU-Net to predict the noise that is applied to the multi-scale features of the damaged document image.
5. The method for repairing damaged document images according to claim 4, characterized in that performing forward diffusion on each layer of features of the lossless document image, and learning a denoising process through PyU-Net during the forward diffusion process to predict the noise applied to the multi-scale features of the damaged document image, includes: receiving the N layers of features of the damaged document image as the feature set $Z_T = \{z_T^1, \ldots, z_T^N\}$ and the N layers of features of the lossless document image as the feature set $Z_0 = \{z_0^1, \ldots, z_0^N\}$; and inputting the scale level s, the time step t of the model's image diffusion, the high-scale features of the lossless document image, the target feature mapping, and the feature set into the neural network that processes forward diffusion for training and learning, and predicting the noise λ applied to the multi-scale features of the damaged document image; wherein the target feature mapping is the low-scale feature obtained after the t-th diffusion step of the low-scale features of the lossless document image; $z_T^1$ represents the first-layer feature of the damaged document image; $z_T^N$ represents the N-th-layer feature of the damaged document image; $z_0^1$ represents the first-layer feature of the lossless document image; and $z_0^N$ represents the N-th-layer feature of the lossless document image.
6. The method for repairing damaged document images according to claim 1, characterized in that the reverse diffusion process includes: for each layer of features of the damaged document image, adding the noise λ obtained in the forward diffusion process to that layer of features, and mapping the features to obtain the features of the denoised and repaired document image.
7. A device for repairing damaged document images, characterized by comprising: a damaged document image acquisition module, an image restoration module, and a model training module; wherein the damaged document image acquisition module is used to acquire the damaged document image to be processed; the image restoration module is used to input the damaged document image to be processed into the trained image restoration model, so that the image restoration model performs image restoration on the damaged document image to be processed and generates a restored document image; the model training module is used to acquire damaged document images and their corresponding lossless document images under different image scenarios, and to place each damaged document image and its corresponding lossless document image in the same group; each group of damaged document images and their corresponding lossless document images is input into the image restoration model to be trained for iterative training until the loss function converges, thereby obtaining the trained image restoration model; during each training iteration, the built-in encoder of the image restoration model uses a spatial pyramid to encode the current group of damaged and lossless document images, obtaining multi-scale features of the damaged document image and of the corresponding lossless document image; the built-in diffusion sub-model of the image restoration model performs forward diffusion on the multi-scale features of the damaged and lossless document images to predict noise applicable to these features; the predicted noise is then applied to the multi-scale features of the damaged and lossless document images for reverse diffusion, yielding denoised restored multi-scale features; the denoised restored multi-scale features are then compared with the multi-scale features of the lossless document image to calculate the loss function; if the loss function has not converged, the network parameters of the current image restoration model are adjusted based on the current loss function value, and a new group of damaged and lossless document images is acquired; the model training module is connected to the damaged document image acquisition module, which in turn is connected to the image restoration module; and the model training module corresponds to the training phase, while the damaged document image acquisition module and the image restoration module correspond to the usage phase.
8. A terminal device, characterized in that the device includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein, when the processor executes the computer program, it implements the method for repairing a damaged document image as described in any one of claims 1-6.
9. A computer-readable storage medium, characterized by comprising a stored computer program, wherein, when the computer program is executed, it controls the device containing the computer-readable storage medium to perform the method for repairing a damaged document image as described in any one of claims 1-6.
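For illustration only, the multi-scale encoding and diffusion steps recited in the claims above can be sketched in a few lines of NumPy. This sketch rests on several assumptions that the claims do not state: the pyramid is built by 2x2 average pooling of a grayscale image, the forward diffusion uses the standard DDPM noising formula with a linear beta schedule, and the PyU-Net noise predictor is replaced by the true sampled noise. The names `build_pyramid`, `alpha_bar`, `forward_diffuse`, `estimate_clean`, and `mse` are hypothetical and do not come from the patent.

```python
import numpy as np


def build_pyramid(image: np.ndarray, levels: int) -> list:
    """Return [full, 1/2, 1/4, ...] resolutions via repeated 2x2 average pooling.

    Assumption: a 2-D grayscale image; the claims only say "downsampled or
    upsampled multiple times".
    """
    pyramid = [image.astype(np.float64)]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        # Crop to an even size so the image splits cleanly into 2x2 blocks.
        h, w = prev.shape[0] // 2 * 2, prev.shape[1] // 2 * 2
        pooled = prev[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(pooled)
    return pyramid


def alpha_bar(num_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> np.ndarray:
    """Cumulative product of (1 - beta_t) for a linear schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def forward_diffuse(z0: np.ndarray, t: int, abar: np.ndarray, rng: np.random.Generator):
    """Standard DDPM forward step: z_t = sqrt(abar_t) * z_0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(abar[t]) * z0 + np.sqrt(1.0 - abar[t]) * eps
    return zt, eps


def estimate_clean(zt: np.ndarray, eps_hat: np.ndarray, t: int, abar: np.ndarray) -> np.ndarray:
    """Invert the forward step given a noise estimate eps_hat."""
    return (zt - np.sqrt(1.0 - abar[t]) * eps_hat) / np.sqrt(abar[t])


def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared error, used here as a stand-in for the loss function."""
    return float(np.mean((a - b) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lossless = rng.random((64, 48))  # synthetic stand-in for a lossless document image
    abar = alpha_bar(1000)
    for scale, z0 in enumerate(build_pyramid(lossless, levels=3)):
        zt, eps = forward_diffuse(z0, t=500, abar=abar, rng=rng)
        # The true noise stands in for the network prediction (oracle placeholder).
        z0_hat = estimate_clean(zt, eps, t=500, abar=abar)
        print(scale, z0.shape, mse(z0_hat, z0))
```

In a trained system the noise estimate would come from the learned network rather than from the sampled noise, and the loss would drive the parameter updates; the sketch only shows how the per-scale features, the noising step, and the comparison against the lossless features fit together.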