A single image super-resolution reconstruction method based on shallow channel separation and aggregation

CN115953294B · Active · Publication Date: 2026-06-26 · XIANGTAN UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
XIANGTAN UNIV
Filing Date
2022-11-22
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing deep learning-based single-image super-resolution reconstruction methods are difficult to apply to real-world scenarios due to limited computing power and insufficient memory. Furthermore, the large number of model parameters and computational demands result in performance limitations.

Method used

A shallow channel separation and aggregation method is adopted. By constructing an image super-resolution reconstruction network with a shallow channel separation and aggregation module, a nonlinear global feature aggregation module, and an upsampling module, feature information is extracted and aggregated, reducing the number of parameters and improving computational efficiency.

Benefits of technology

With limited parameters and computational resources, experimental results comparable to those of large-scale networks were achieved, reaching an optimized balance in performance and recovering more fine-grained information.

✦ Generated by Eureka AI based on patent content.

Figure CN115953294B_ABST

Abstract

The application discloses a single-image super-resolution reconstruction method based on shallow channel separation and aggregation, and belongs to the field of super-resolution image reconstruction. First, data is preprocessed to obtain a high-resolution picture and a low-resolution picture; then, a single-image super-resolution reconstruction network based on shallow channel separation and aggregation is constructed and trained, including a channel separation and aggregation module and a global feature aggregation module; the channel separation and aggregation module is used for feature extraction of shallow picture information, and can better obtain detail and texture information of the picture; and the global feature aggregation module is used for aggregating feature information obtained by the channel separation and aggregation module, and better global information is obtained. The method is beneficial to learning the relationship between deep and shallow feature modes, thereby recovering more fine-grained information, and compared with large and heavy networks, a better balance between computing resources and performance is achieved.

Description

Technical Field

[0001] This invention belongs to the field of image processing technology, and relates to a single image super-resolution reconstruction method based on deep learning, and particularly to a single image super-resolution reconstruction method based on channel separation and aggregation.

Background Technology

[0002] Visual information possesses intuitive and efficient descriptive capabilities, playing a vital role in human society. Images contain a wealth of visual information, allowing people to glean relevant information about the described object, making them crucial information carriers. Generally, higher image resolution implies more detail, and in many fields, such as medical imaging and video surveillance, detail plays a critical role. However, due to the influence of hardware, natural environment, human factors, and other influences, images acquired by imaging systems often suffer from low resolution and blurriness, failing to meet the demands for high-quality images. Image super-resolution reconstruction technology can reconstruct clearer, higher-resolution, and visually better images from acquired low-quality images, thereby improving image resolution and restoring details. For the past two decades, image super-resolution reconstruction has been a research hotspot in image processing, computer vision, and machine learning, attracting widespread attention from industry and academia.

[0003] Traditional super-resolution reconstruction algorithms mainly include interpolation-based super-resolution reconstruction, degradation model-based super-resolution reconstruction, and learning-based super-resolution reconstruction algorithms. Interpolation-based super-resolution reconstruction uses basis functions or interpolation kernels to approximate the lost high-frequency information of the image, thereby achieving super-resolution reconstruction. Common interpolation-based methods include nearest neighbor interpolation, bilinear interpolation, and bicubic interpolation. Degradation model-based super-resolution reconstruction assumes that a high-resolution image has undergone appropriate motion changes, blurring, and noise to obtain a low-resolution image. It extracts key information from the low-resolution image and combines it with prior knowledge of the unknown super-resolution image to constrain the generation of the super-resolution image. Common methods include iterative backprojection, convex set projection, and maximum a posteriori probability. Learning-based super-resolution reconstruction utilizes a large amount of training data to learn a certain correspondence between low-resolution and high-resolution images. Based on the mapping, it predicts the high-resolution image corresponding to the low-resolution image, thus achieving the image super-resolution reconstruction process. Common learning-based methods include manifold learning and sparse coding methods.

[0004] In recent years, traditional super-resolution reconstruction methods have proven unsuitable for high-magnification super-resolution reconstruction, and deep learning-based super-resolution reconstruction algorithms have become the mainstream research direction, with typical network structures including SRCNN, ESPCN, VDSR, DRCN, and EDSR. Dong's SRCNN was the first to use convolutional neural networks (CNNs) in single-image super-resolution. Compared to traditional algorithms, SRCNN recovers more detailed features, significantly improving visual quality. ESPCN proposes a novel upsampling method, achieving image magnification through channel expansion and pixel rearrangement. VDSR uses 20 convolutional layers to extract features from the input image and introduces residual learning to preserve detailed features across the depth of the network. The deeply recursive convolutional network DRCN supervises each recursion, iterating repeatedly through a recursive layer to acquire high-frequency information and using skip connections to mitigate vanishing gradients. EDSR modifies the residual blocks by removing the normalization layers, reducing network parameters by 40% and increasing the 2× performance by 0.48 dB. As the number of layers increases, these deep learning-based networks can not only extract multi-level features from data but also perform end-to-end joint optimization and reconstruction, and thus possess stronger representation capabilities. While most current super-resolution networks achieve relatively good results, they are hampered by numerous model parameters and high computational costs. Under conditions of limited computing power and insufficient memory, super-resolution reconstruction based on deep learning methods is difficult to apply to real-world scenarios.

Summary of the Invention

[0005] This invention proposes a single-image super-resolution reconstruction method based on shallow channel separation and aggregation, primarily applied in the field of image processing. Its main advantage is achieving a balance between parameters, memory usage, and computational cost, resulting in optimal performance. By separating and aggregating channels, features are extracted and information acquisition is increased, thereby achieving better results.

[0006] A single-image super-resolution reconstruction method based on shallow channel separation and aggregation, comprising the following steps:

[0007] Step (1) Processing the training dataset;

[0008] A high-resolution image I_HR of size H_r × W_r is degraded and downscaled by a factor of s to obtain a low-resolution image I_LR of size (H_r // s) × (W_r // s); the training set is composed of all such high-resolution/low-resolution image pairs.

[0009] Step (2) Construct an image super-resolution reconstruction network based on shallow channel separation and aggregation;

[0010] The image super-resolution reconstruction network includes a shallow channel separation and aggregation module, a nonlinear global feature aggregation module, and an upsampling module. The shallow channel separation and aggregation module separates and aggregates the channels of low-resolution images to extract their features. The global feature aggregation module aggregates the features obtained from each shallow channel separation and aggregation module. The upsampling module enlarges the image produced by the network to the same size as the high-resolution image.

[0011] Step (3) Train an image super-resolution reconstruction network based on shallow channel separation and aggregation.

[0012] Step (4) Complete the image super-resolution reconstruction task using the trained image super-resolution reconstruction network.

[0013] The specific method for step (1) is as follows:

[0014] The publicly available DIV2K dataset is used as the training data. The DIV2K dataset contains 800 training images, 100 validation images, and 100 test images. The 800 high-resolution training images are degraded by bicubic-interpolation downsampling at a specified factor, which also blurs the image, yielding low-resolution images I_LR reduced by that factor. The low-resolution images I_LR are used as the input images during model training, and the high-resolution images I_HR serve as the reference images against which the images produced during training are compared.
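
The pairing of low-resolution and high-resolution images described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented preprocessing: the names `lr_size` and `make_pair` are hypothetical, and a plain s×s block average stands in for bicubic interpolation so that the sketch needs nothing beyond NumPy. Only the size relationship (H_r // s) × (W_r // s) is meant to be faithful.

```python
import numpy as np

def lr_size(h_r, w_r, s):
    """Size of the low-resolution image: (H_r // s) x (W_r // s)."""
    return h_r // s, w_r // s

def make_pair(hr, s):
    """Return an (I_LR, I_HR) pair from an H x W x C image.

    Assumption: block averaging replaces the bicubic kernel named in the
    text; a real pipeline would call a bicubic resampler here instead.
    """
    h, w = lr_size(hr.shape[0], hr.shape[1], s)
    hr = hr[: h * s, : w * s]                          # crop so both sides divide by s
    lr = hr.reshape(h, s, w, s, -1).mean(axis=(1, 3))  # average each s x s block
    return lr, hr

hr_image = np.random.rand(101, 64, 3)                  # stand-in for one DIV2K image
lr_image, hr_cropped = make_pair(hr_image, s=2)
print(lr_image.shape, hr_cropped.shape)                # (50, 32, 3) (100, 64, 3)
```

Cropping the high-resolution image to a multiple of s keeps the two images of a pair exactly aligned, which is why the reference image returned here can be one row smaller than the original.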

[0015] The specific method for step (2) is as follows:

[0016] The image super-resolution reconstruction network based on shallow channel separation and aggregation consists of a first branch and a second branch connected in parallel. The first branch separates the channels, which are then added sequentially to the second branch to enrich the texture information of the shallow network. The first branch contains a first 3D convolutional layer, whose output undergoes channel separation to obtain n // n1 channel groups, where n is the number of channels of the low-resolution image features after this convolution and n1 is the number of channels in each group. The channels of each group are then sequentially added to the shallow channel separation and aggregation modules of the second branch.

[0017] The second branch extracts shallow network information from the image and includes a second 3D convolutional layer and n // n1 shallow channel separation and aggregation modules, each followed by a 3D convolutional layer. Each shallow channel separation and aggregation module consists of three 3D convolutional layers and a non-linear activation function; by concatenating and summing channels, it extracts richer shallow geometric information from the image. The global feature aggregation module consists of a sub-pixel convolution, two 3D convolutional layers, and a non-linear activation function, with the two convolutional layers sharing parameters. The features extracted by each shallow channel separation and aggregation module are input into the global feature aggregation module for better global feature extraction. The upsampling module consists of channel and spatial attention mechanisms (CBAM), a sub-pixel convolution, and a 3D convolutional layer, where the number of CBAMs is s² − 1. The upsampling module not only magnifies the image but also uses the channel and spatial attention mechanisms to focus on channels rich in detail and texture features.
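
The sub-pixel convolution named in both the global feature aggregation module and the upsampling module ends with a pixel-rearrangement step. The sketch below shows only that rearrangement in NumPy. The function name `pixel_shuffle` is illustrative, and the channel ordering is an assumption borrowed from common deep learning libraries (it matches PyTorch's `PixelShuffle`); the patent text does not specify the layout.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) feature map into (C, H*r, W*r)."""
    c_in, h, w = x.shape
    assert c_in % (r * r) == 0, "channel count must be divisible by r*r"
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)        # split channels into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)      # reorder to (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)   # interleave the r x r sub-pixels

features = np.arange(16, dtype=float).reshape(4, 2, 2)  # C*r*r = 4 with r = 2
print(pixel_shuffle(features, 2).shape)                 # (1, 4, 4)
```

Each group of r² channels supplies the r × r sub-pixels of one output location, so spatial resolution grows by r in each direction while the channel count shrinks by r².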

[0018] The specific method for step (3) is as follows:

[0019] The input to the image super-resolution reconstruction network consists of two parts. First, the low-resolution image is processed through a first 3D convolution, increasing the number of channels from 3 to n. Every n1 channels form a group, giving n // n1 groups in total, and the features of each group are added to a shallow channel separation and aggregation module. Second, the same low-resolution image is processed through another 3D convolution, increasing the number of channels to n. A shallow channel separation and aggregation module is applied and the first n − n1 channels are taken; the channel group separated by the first branch is then concatenated onto the output of the shallow channel separation and aggregation module. After passing through one 3D convolutional layer, the resulting feature map is input to the next shallow channel separation and aggregation module, and this process is repeated n // n1 times. Each concatenated result is also input to the global feature aggregation module to obtain better global features. Finally, an upsampling operation enlarges the image produced by the network to the same size as the high-resolution image.

[0020] The specific method for step (4) is as follows:

[0021] Using the widely recognized test sets Set5 and Set14, the images are downsampled by a specified factor using bicubic interpolation, yielding low-resolution images I_LR reduced by that factor. Each low-resolution image I_LR is fed into the trained image super-resolution reconstruction network for testing, and the key evaluation metric of the network, PSNR, is obtained.

[0022] The beneficial effects of this invention are as follows:

[0023] The innovation of this invention lies in proposing a single-image super-resolution reconstruction method based on shallow channel separation and aggregation. This method divides the feature information extracted from shallow channels into different combinations and gradually incorporates them into a designed deep feature extraction module for feature aggregation and distillation. This facilitates learning the relationship between shallow and deep feature patterns, thereby recovering more fine-grained information. Compared to large, heavyweight networks, this network model requires fewer parameters but achieves comparable experimental results, achieving a good balance between computational resources and performance.

Attached Figure Description

[0024] Figure 1 is a flowchart of the single-image super-resolution reconstruction method based on shallow channel separation and aggregation provided by the present invention;

[0025] Figure 2 is a schematic diagram of the super-resolution network structure based on shallow channel separation and aggregation provided by the present invention;

[0026] Figure 3 is a schematic diagram of the shallow channel separation and aggregation module structure provided by the present invention;

[0027] Figure 4 is a schematic diagram of the global feature aggregation module structure provided by the present invention;

[0028] Figure 5 is a schematic diagram of the upsampling module structure provided by the present invention.

Detailed Implementation

[0029] The present invention will now be described in further detail with reference to the accompanying drawings.

[0030] With reference to Figure 1, the implementation steps of the present invention are described in further detail below.

[0031] Step (1) Data preprocessing;

[0032] The DIV2K dataset, a publicly available dataset, is used as the training data. The DIV2K dataset contains 800 training images, 100 validation images, and 100 test images. A high-resolution image I_HR of size H_r × W_r is degraded and downscaled by a factor of s to obtain a low-resolution image I_LR of size (H_r // s) × (W_r // s). The training set is composed of all such high-resolution/low-resolution image pairs.

[0033] Training data processing: the 800 high-resolution images in DIV2K are degraded by bicubic-interpolation downsampling at a specified factor, which also blurs the image, yielding low-resolution images I_LR reduced by that factor. In this embodiment, the low-resolution image I_LR is obtained from the high-resolution image I_HR by bicubic-interpolation downsampling with a factor of 2, so both the width and height of I_LR are 1/2 those of I_HR. The low-resolution images I_LR are used as the input images during model training, and the high-resolution images I_HR serve as the reference images against which the images produced during training are compared.

[0034] Step (2) Construct an image super-resolution reconstruction network based on shallow channel separation and aggregation;

[0035] The image super-resolution reconstruction network includes a shallow channel separation and aggregation module, a nonlinear global feature aggregation module, and an upsampling module; the overall network structure is shown in Figure 2. The shallow channel separation and aggregation module separates and aggregates the channels of low-resolution images to extract their features; its structure is shown in Figure 3. The nonlinear global feature aggregation module aggregates the features obtained from the shallow channel separation and aggregation modules; its structure is shown in Figure 4. The upsampling module enlarges the low-resolution image to the same size as the high-resolution image; its structure is shown in Figure 5.

[0036] The network is divided into two branches. The first branch divides the low-resolution image channels into groups of four, and each group is added to a shallow channel separation and aggregation module. The second branch feeds the low-resolution image through the shallow channel separation and aggregation modules, each of which consists of three convolutional layers and a LeakyReLU activation function. The convolutional kernel size is 3×3, the padding is 1, the stride is 1, and the LeakyReLU parameter (negative slope) is 0.05. The input feature map has 64 channels; after the 3×3 convolution the feature map has 32 channels, which are then processed by the LeakyReLU activation function. Residual summation and concatenation are then performed on this basis, and finally the extracted image features are output to the next shallow channel separation and aggregation module. The global feature aggregation module consists of a sub-pixel convolution, two 3D convolutional layers, and a non-linear activation function. Its parameter settings are as follows: the first convolutional kernel size is 1×1 with padding 0 and stride 1; the second convolutional kernel size is 2×2 with padding 0 and stride 2. The two convolutions share parameters, and the LeakyReLU parameter is likewise 0.05. The upsampling module consists of channel and spatial attention mechanisms (CBAM), a sub-pixel convolution, and a convolutional layer, where the number of CBAMs is s² − 1 and s is the scaling factor; the kernel size is 3×3, the padding is 1, and the stride is 1.
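
The kernel, padding, and stride values listed above fix each layer's spatial output size through the standard convolution size formula, floor((size + 2·padding − kernel) / stride) + 1. The sketch below works through that arithmetic, together with a LeakyReLU at slope 0.05 and the s² − 1 CBAM count. The helper names and the 48-pixel patch size are hypothetical, chosen only for illustration.

```python
import numpy as np

def conv_out_size(size, kernel, padding, stride):
    """Spatial output size of a convolution without dilation."""
    return (size + 2 * padding - kernel) // stride + 1

def leaky_relu(x, negative_slope=0.05):
    """LeakyReLU using the 0.05 slope quoted in the embodiment."""
    return np.where(x >= 0, x, negative_slope * x)

def cbam_count(s):
    """Number of CBAM blocks in the upsampling module: s^2 - 1."""
    return s * s - 1

size = 48  # hypothetical input patch size, not taken from the patent
print(conv_out_size(size, kernel=3, padding=1, stride=1))  # 48
print(conv_out_size(size, kernel=1, padding=0, stride=1))  # 48
print(conv_out_size(size, kernel=2, padding=0, stride=2))  # 24
print(leaky_relu(np.array([-2.0, 0.0, 3.0])))
print([cbam_count(s) for s in (2, 3, 4)])                  # [3, 8, 15]
```

The 3×3 and 1×1 settings preserve spatial size, while the 2×2 convolution with stride 2 halves it.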

[0037] Step (3) Train an image super-resolution reconstruction network based on shallow channel separation and aggregation;

[0038] The input to the image super-resolution reconstruction network consists of two parts: First, a low-resolution image with 3 channels is passed through a 3×3 convolutional layer to increase the number of channels to 64. These channels are grouped into 16 groups of 4. The features from each group are then added to a shallow channel separation and aggregation module. Second, the low-resolution image with 3 channels is also passed through a 3×3 convolutional layer to increase the number of channels to 64. A shallow channel separation and aggregation module is used, taking the first 60 channels. The channel groups separated in the first branch are then connected to the shallow channel separation and aggregation module, passed through a 3D convolutional layer, and the resulting feature map is input to the next shallow channel separation and aggregation module. This process is repeated 16 times. Each connection is also input to a global feature aggregation module to obtain better global features, followed by upsampling to enlarge the image to the size of a high-resolution image.
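
The channel bookkeeping in this step (64 channels, 16 groups of 4, the first 60 channels kept, 16 repetitions) can be traced with a shape-only sketch. This is one possible reading of the data flow, not the patented network: `scsa_module` and `conv` are hypothetical identity placeholders for the shallow channel separation and aggregation module and the convolution that follows it, and the 8×8 spatial size is arbitrary.

```python
import numpy as np

n, n1 = 64, 4                        # channel count and group width from the embodiment
h, w = 8, 8                          # hypothetical spatial size

def scsa_module(x):
    """Placeholder for the shallow channel separation and aggregation module."""
    return x                         # identity stand-in (assumption)

def conv(x):
    """Placeholder for the convolution that follows each module."""
    return x                         # identity stand-in (assumption)

rng = np.random.default_rng(0)
branch1 = rng.random((n, h, w))      # first branch after its 3 -> 64 convolution
branch2 = rng.random((n, h, w))      # second branch after its 3 -> 64 convolution

groups = np.split(branch1, n // n1, axis=0)   # 16 groups of 4 channels each
collected = []                                # inputs for global feature aggregation
x = branch2
for group in groups:
    kept = scsa_module(x)[: n - n1]                 # keep the first 60 channels
    joined = np.concatenate([kept, group], axis=0)  # 60 + 4 = 64 channels
    collected.append(joined)
    x = conv(joined)                                # feeds the next module

print(len(groups), groups[0].shape, x.shape, len(collected))
# 16 (4, 8, 8) (64, 8, 8) 16
```

Because every iteration drops n1 channels and concatenates a fresh group of n1, the feature map stays at 64 channels throughout the 16 repetitions.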

[0039] Step (4) Complete the image super-resolution reconstruction task using the trained image super-resolution reconstruction network;

[0040] Using the widely recognized test sets Set5 and Set14, the images are downsampled by a factor of 2 using bicubic interpolation, yielding low-resolution images I_LR reduced by a factor of 2. Each low-resolution image I_LR is fed into the trained image super-resolution reconstruction network for testing, and the key evaluation metric of the network, the peak signal-to-noise ratio (PSNR), is obtained.
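
The evaluation metric named above follows the standard definition PSNR = 10 · log10(MAX² / MSE). The sketch below is a minimal NumPy version. It assumes 8-bit images with a peak value of 255 and compares all pixels directly; the text does not say which colour channel is evaluated or whether image borders are cropped, so those choices are left out.

```python
import numpy as np

def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    diff = reference.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")              # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

hr = np.full((4, 4), 100, dtype=np.uint8)
sr = hr.copy()
sr[0, 0] = 116                           # one pixel off by 16, so MSE = 256 / 16 = 16
print(round(psnr(hr, sr), 2))            # 36.09
```

Converting to floating point before subtracting avoids the wrap-around that unsigned 8-bit arithmetic would otherwise introduce.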

Claims

1. A single-image super-resolution reconstruction method based on shallow channel separation and aggregation, characterized in that the method includes: Step (1): Processing the training dataset; a high-resolution image IHR with dimensions Hr × Wr is degraded and downscaled by a factor of s to obtain a low-resolution image ILR with dimensions (Hr // s) × (Wr // s); all high-resolution/low-resolution image pairs are combined to form a training set. Step (2): Construct a single-image super-resolution reconstruction network based on shallow channel separation and aggregation; The image super-resolution reconstruction network includes a shallow channel separation and aggregation module, a global feature aggregation module, and an upsampling module; The shallow channel separation and aggregation module is used to separate and aggregate the channels of low-resolution images. It consists of three convolutional layers and a nonlinear activation function. Through channel connection and summation operations, it extracts the geometric information of the shallow image. The global feature aggregation module aggregates the features obtained from each shallow channel separation and aggregation module, and consists of a sub-pixel convolution, two three-dimensional convolutional layers with shared parameters, and a non-linear activation function. The upsampling module enlarges the image to the high-resolution image size and consists of s² − 1 channel and spatial attention mechanisms (CBAM), a sub-pixel convolution, and a convolutional layer; Step (3): Train a single-image super-resolution reconstruction network based on shallow channel separation and aggregation; Step (4): Complete the single-image super-resolution reconstruction task using the trained single-image super-resolution reconstruction network.

2. The single-image super-resolution reconstruction method based on shallow channel separation and aggregation according to claim 1, characterized in that, The specific method for step (1) is as follows: The DIV2K dataset, a publicly available dataset, was used as the training data. The DIV2K dataset contains 800 training images, 100 validation images, and 100 test images. The 800 training images in DIV2K are high-resolution images, which were downsampled using bicubic interpolation followed by blurring at a specified downsampling factor to obtain a low-resolution image (ILR) reduced by a specified factor. The low-resolution image ILR was obtained by downsampling the high-resolution image IHR by a factor of s using bicubic interpolation; that is, the width and height of the low-resolution image ILR are both 1 / s of the high-resolution image IHR. The low-resolution image ILR from the training data was used as the input image during model training, and the high-resolution image IHR was used as the comparison image for the images obtained during model training.

3. The single-image super-resolution reconstruction method based on shallow channel separation and aggregation according to claim 1, characterized in that the specific method for step (2) is as follows: The image super-resolution reconstruction network based on shallow channel separation and aggregation is divided into a first branch and a second branch in parallel. The first branch separates the channels and adds them to the second branch in sequence to enrich the texture information of the shallow network. The first branch contains a first three-dimensional convolutional layer, on which the channel separation operation is performed to obtain n // n1 channel groups, where n represents the number of channels in the low-resolution image and n1 represents the number of channels in each group. Then, the channels of each group are added to the shallow channel separation and aggregation module of the second branch in sequence. The second branch extracts shallow network information of the image, including a second three-dimensional convolutional layer and n // n1 shallow channel separation and aggregation modules, with a three-dimensional convolutional layer added after each shallow channel separation and aggregation module; the shallow channel separation and aggregation module is composed of three three-dimensional convolutional layers and a non-linear activation function. By connecting and summing the channels, richer shallow image geometric information can be extracted. The global feature aggregation module consists of a sub-pixel convolution, two 3D convolutional layers, and a non-linear activation function, with the two convolutional layers sharing parameters; the features extracted by each shallow channel separation and aggregation module are input into the global feature aggregation module to obtain better global features; The upsampling module consists of channel attention and spatial attention mechanisms CBAM, a sub-pixel convolution, and a three-dimensional convolutional layer, where the number of CBAMs is s² − 1. The upsampling module not only performs image magnification operations, but also uses channel and spatial attention mechanisms to focus more on channels with rich details and texture features.

4. The single-image super-resolution reconstruction method based on shallow channel separation and aggregation according to claim 1, characterized in that the specific method for step (3) is as follows: The input to the image super-resolution reconstruction network consists of two parts: the first part is a low-resolution image that is processed by a first three-dimensional convolution, increasing the number of channels from 3 to n, with each n1 channels forming a group, resulting in a total of n // n1 groups. The features in each group of channels are then added to the shallow channel separation and aggregation module. In the second part, the same low-resolution image is processed through another 3D convolution, increasing the number of channels to n. A shallow channel separation and aggregation module is used to take the first n − n1 channels. The channel groups separated by the first branch are then connected to the shallow channel separation and aggregation module. After passing through a 3D convolutional layer, the resulting feature map is input into the next shallow channel separation and aggregation module. This process is repeated n // n1 times. Each connection is also input into the global feature aggregation module to obtain better global features. Then, an upsampling operation is performed to enlarge the image trained by the network to the same size as the high-resolution image.

5. The single-image super-resolution reconstruction method based on shallow channel separation and aggregation according to claim 1, characterized in that the specific method for step (4) is as follows: Using the widely recognized test sets Set5 and Set14, the images are downsampled by a specified factor using bicubic interpolation to obtain a low-resolution image ILR that is reduced by the specified factor. The low-resolution image ILR is then fed into the trained image super-resolution reconstruction network for testing, and the key evaluation metric of the image super-resolution reconstruction network, PSNR, i.e., peak signal-to-noise ratio, is obtained.