Multi-modal medical image fusion method and system of multi-channel integration network
By constructing a multi-scale, multi-channel image fusion network model and combining dense residual blocks and multi-spectral channel attention mechanisms, the problem of information loss in multimodal medical image fusion was solved, achieving better image fusion results and improving the accuracy and reliability of diagnosis.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- FOSHAN UNIVERSITY
- Filing Date
- 2023-04-06
- Publication Date
- 2026-06-26
AI Technical Summary
Existing multimodal medical image fusion methods suffer from information and detail loss during the fusion process. In particular, the lack of specialized strategies to preserve fine-grained features in the source images and to consider information differences between scales leads to poor fusion results.
A multi-channel integration network is adopted, which constructs a multi-scale, multi-channel image fusion network model by introducing dense residual blocks. It combines dense residual connection layers, channel extraction modules and multi-spectral channel attention mechanisms to extract and fuse spatial domain information, channel information and fine-grained features of multimodal medical images, and uses a loss function to optimize the fusion results.
It achieves better image fusion results, preserves the details and structural information of multimodal medical images, improves the accuracy and reliability of diagnosis, and reduces misdiagnosis and surgical errors.
Figure CN116433546B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of image processing technology, and in particular to a multimodal medical image fusion method and system using a multichannel integrated network.
Background Technology
[0002] Because imaging mechanisms differ, a single type of medical image often cannot give doctors comprehensive diagnostic information. For example, magnetic resonance imaging (MRI) provides soft tissue information and high-resolution anatomical images, but it cannot detect metabolic activity. Positron emission tomography (PET) images offer rich information on tumor function and metabolism, and single-photon emission computed tomography (SPECT) images reflect blood flow in tissues and organs; however, both have low resolution. Multimodal medical image fusion technology addresses this issue: it fuses anatomical and functional images so that the result retains both the positional information of the anatomical image and the molecular activity information of the functional image. In summary, medical image fusion can provide a more comprehensive and reliable description of lesions, thereby aiding biomedical research and clinical diagnosis, for example in surgical navigation, radiotherapy planning, and future health prediction.
[0003] Multimodal medical image fusion is typically performed at the pixel level and can be broadly categorized into traditional methods and deep learning methods. While traditional methods have achieved good fusion results, their pixel activity level measurement and weight allocation strategies are designed separately and are not strongly correlated with the fusion method, significantly limiting algorithm performance. Therefore, it is difficult for traditional methods to design an ideal pixel activity level measurement or weight allocation strategy that comprehensively considers all key issues. Meanwhile, due to the excellent feature extraction and data representation capabilities of convolutional neural networks, more and more deep learning-based fusion methods have been proposed. However, existing neural network frameworks still have some problems. For example, they often feed source images into a single network without considering inter-scale information, leading to the loss of some important information; there is no specifically designed strategy to preserve fine-grained features in the source image, resulting in loss of detail and low contrast; and information loss occurs due to the use of global average pooling channel attention mechanisms.
Summary of the Invention
[0004] To address the aforementioned technical problems, the present invention aims to provide a multimodal medical image fusion method and system using a multichannel integrated network. This method achieves better image fusion results by fusing different and complementary information from original multimodal medical images into a single image.
[0005] The first technical solution adopted in this invention is a multimodal medical image fusion method using a multi-channel integrated network, comprising the following steps:
[0006] Acquire multimodal medical images to be fused;
[0007] Dense residual blocks are introduced to construct a multi-scale, multi-channel image fusion network model;
[0008] The multimodal medical images to be fused are input into a multi-scale, multi-channel image fusion network model for fusion processing, and the fused multimodal medical images are output.
[0009] Furthermore, the step of acquiring the multimodal medical images to be fused specifically includes:
[0010] Acquire multimodal medical images with RGB color space;
[0011] A multimodal medical image with an RGB color space is converted to a YCrCb color space to obtain a multimodal medical image with a YCrCb color space, wherein the YCrCb color space multimodal medical image includes a Y channel, a Cr channel and a Cb channel;
[0012] The Y channel image of a multimodal medical image in the YCrCb color space is merged with the multimodal medical image in the RGB color space to obtain the multimodal medical image to be fused.
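The color-space step above can be illustrated with a minimal sketch. It assumes 8-bit pixels and the common BT.601 full-range weights (the patent text does not state which conversion coefficients are used), and the function names `rgb_to_ycrcb` and `split_y_channel` are hypothetical.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb); BT.601 weights are an assumption."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128.0
    cb = (b - y) * 0.564 + 128.0
    return y, cr, cb


def split_y_channel(rgb_image):
    """Split an image (rows of (R, G, B) tuples) into a Y plane and per-pixel (Cr, Cb) pairs."""
    y_plane, chroma = [], []
    for row in rgb_image:
        converted = [rgb_to_ycrcb(*px) for px in row]
        y_plane.append([c[0] for c in converted])
        chroma.append([(c[1], c[2]) for c in converted])
    return y_plane, chroma


# Toy 1x2 "functional" image: one pure-red pixel and one mid-gray pixel.
functional = [[(255, 0, 0), (128, 128, 128)]]
y_plane, chroma = split_y_channel(functional)
```

Only the Y plane would be passed on to the fusion network; the Cr and Cb values are set aside for restoring color afterwards.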
[0013] Furthermore, the step of introducing dense residual blocks to construct a multi-scale, multi-channel image fusion network model specifically includes:
[0014] A multi-scale, multi-channel image fusion network model is constructed, which includes a first convolutional layer, a normalization layer, a rectified linear unit (ReLU), a dense residual block, a channel extraction module, and a second convolutional layer.
[0015] The dense residual block includes a first residual dense connection layer, a second residual dense connection layer, and a third residual dense connection layer. Each residual dense connection layer consists of three convolutional layers with a kernel of 3*3 and one convolutional layer with a kernel of 1*1. Each residual dense connection layer is connected to a channel extraction module.
[0016] Furthermore, the step of inputting the multimodal medical images to be fused into a multi-scale, multi-channel image fusion network model for fusion processing and outputting the fused multimodal medical image specifically includes:
[0017] The multimodal medical images to be fused are input into a multi-scale, multi-channel image fusion network model;
[0018] The first convolutional layer, normalization layer, and rectified linear unit (ReLU) of the multi-scale multi-channel image fusion network model are used to perform preliminary feature extraction processing on the multimodal medical images to be fused, and an initial feature map is obtained;
[0019] The initial feature map is input into the first residual dense connection layer, the second residual dense connection layer, and the third residual dense connection layer respectively for feature extraction processing to obtain the corresponding feature map;
[0020] The channel extraction module of the multi-scale multi-channel image fusion network model receives the feature map output by each residual dense connection layer, performs information extraction processing on the feature map, and obtains the spatial domain information, channel information and fine-grained feature information of the corresponding feature map;
[0021] The spatial domain information, channel information, and fine-grained feature information of the feature map are merged to obtain a comprehensive feature map.
[0022] The second convolutional layer of the multi-scale, multi-channel image fusion network model performs fusion processing on the comprehensive feature map to obtain the fused multimodal medical image.
[0023] Furthermore, the step of inputting the initial feature map into the first residual dense connection layer, the second residual dense connection layer, and the third residual dense connection layer for feature extraction to obtain the corresponding feature map specifically includes:
[0024] The initial feature map is input into the first residual dense connection layer, the second residual dense connection layer, and the third residual dense connection layer, respectively.
[0025] Using the three convolutional layers with a 3*3 kernel in each residual dense connection layer, feature extraction is performed on the initial feature map through dense connections to obtain the first feature extraction map, the second feature extraction map and the third feature extraction map;
[0026] Based on a convolutional layer with a kernel of 1*1 for each residual dense connection layer, the first feature extraction map, the second feature extraction map, and the third feature extraction map are fused to obtain a fused feature map;
[0027] The fused feature map is added to the corresponding initial feature map to output the corresponding feature map.
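The data flow through one residual dense connection layer can be made concrete by tracking channel counts. This is a sketch under stated assumptions: the 16 input channels come from the description, while the growth of 16 channels per 3*3 layer and the choice to concatenate the block input together with the earlier outputs (the usual residual dense block layout) are assumptions, and `rdb_channel_plan` is a hypothetical helper.

```python
def rdb_channel_plan(in_channels=16, growth=16, num_dense=3):
    """List (layer name, kernel, input channels, output channels) for one residual dense block."""
    plan = []
    seen = in_channels  # channels visible to the next layer: block input plus earlier outputs
    for k in range(num_dense):
        plan.append((f"dense_conv{k + 1}", "3x3", seen, growth))
        seen += growth  # dense connection: this output is concatenated for all later layers
    # The 1x1 convolution fuses everything back to in_channels so that the
    # element-wise residual addition with the block input is shape-compatible.
    plan.append(("fusion_conv", "1x1", seen, in_channels))
    return plan


for name, kernel, c_in, c_out in rdb_channel_plan():
    print(f"{name:12s} {kernel}  {c_in:3d} -> {c_out:3d}")
```

The point of the 1*1 layer is visible in the last row: it restores the input channel count, which is what makes the final addition to the initial feature map possible.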
[0028] Furthermore, the feature extraction formula for the fine-grained feature information of the feature map is as follows:
[0029] $F = \mathrm{Cov}^{n}(x) \oplus \mathrm{Cov}(\nabla x)$
[0030] In the above formula, $x$ represents the input feature map, $\mathrm{Cov}(\cdot)$ represents a convolutional layer, $\mathrm{Cov}^{n}(\cdot)$ represents n cascaded convolutional layers, $\nabla$ represents the Sobel gradient operator, $\oplus$ indicates element-wise addition, and $F$ represents the output of the GRDB module;
[0031] The feature extraction formula for the spatial domain information of the feature map is:
[0032] $f = F_{feat}(x_{in})$
[0033] $a = \sigma(F_{att}(f))$
[0034] $x_{out} = x_{in} \oplus (f \otimes a)$
[0035] In the above formula, $x_{in}$ represents the input feature map, $x_{out}$ represents the result of element-wise addition with $x_{in}$, $\sigma$ represents the sigmoid function, $F_{feat}$ represents the feature branch convolutional layer, $F_{att}$ represents the attention branch convolutional layer, $f$ represents the result of the feature branch convolutional layer, $\oplus$ represents element-wise addition, and $\otimes$ represents element-wise multiplication;
[0036] The feature extraction formula for the channel information of the feature map is:
[0037] $B_{h,w}^{u_i,v_i} = \cos\left(\frac{\pi u_i}{H}\left(h + \frac{1}{2}\right)\right)\cos\left(\frac{\pi v_i}{W}\left(w + \frac{1}{2}\right)\right)$
[0038] $Freq_i = \sum_{h=0}^{H-1}\sum_{w=0}^{W-1} X_{i,:,h,w}\, B_{h,w}^{u_i,v_i}$
[0039] $Freq = compress(X) = cat([Freq_0, Freq_1, ..., Freq_{n-1}])$
[0040] $att = sigmoid(fc(Freq))$
[0041]
[0042] In the above formula, $H$ and $W$ represent the height and width of the feature map, respectively; $h$ and $w$ index the rows and columns of the feature map; $[u_i, v_i]$ are the two-dimensional indices of the frequency component corresponding to $X_i$; $B$ represents the two-dimensional DCT frequency component; $X_i$ represents the i-th part of the feature map; $Freq_i$ represents the compressed vector of the i-th part; $Freq$ represents the overall compressed vector; $att$ represents the multispectral channel attention map; cat(·) represents channel merging; and fc(·) represents a fully connected layer.
[0043] Furthermore, it also includes calculating a loss function based on the multimodal medical image to be fused and the fused multimodal medical image, the expression of which is:
[0044] $L = L_{SSIM} + L_{SF}$
[0045]
[0046] $SSIM(I_1, I_f) = \frac{(2\mu_{I_1}\mu_{I_f} + C_1)(2\sigma_{I_1 I_f} + C_2)}{(\mu_{I_1}^2 + \mu_{I_f}^2 + C_1)(\sigma_{I_1}^2 + \sigma_{I_f}^2 + C_2)}$
[0047] In the above formula, $\mu_{I_1}$ and $\mu_{I_f}$ represent the mean intensity of the multimodal medical image $I_1$ to be fused and of the fused multimodal medical image $I_f$, respectively; $C_1$ and $C_2$ represent two constants; $\sigma_{I_1}^2$ and $\sigma_{I_f}^2$ represent the variance of $I_1$ and of $I_f$, respectively; $\sigma_{I_1 I_f}$ represents the covariance of $I_1$ and $I_f$; SF represents the spatial frequency of the image; $L$ represents the total loss function; $L_{SSIM}$ represents the structural similarity loss; and $L_{SF}$ represents the spatial frequency loss.
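As a worked illustration of the structural similarity term, the sketch below computes SSIM over a whole image from the quantities defined above (means, variances, covariance, and the constants C1 and C2). Several points are assumptions rather than part of the patent text: SSIM is evaluated in a single global window instead of sliding local windows, C1 and C2 take the customary values for 8-bit images, and the form of `ssim_loss` (one `1 - SSIM` term per source image) is hypothetical.

```python
def ssim_global(a, b, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Single-window SSIM between two equally sized grayscale images (lists of rows)."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    var_x = sum((v - mu_x) ** 2 for v in xs) / n
    var_y = sum((v - mu_y) ** 2 for v in ys) / n
    cov = sum((p - mu_x) * (q - mu_y) for p, q in zip(xs, ys)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(fused, src1, src2):
    """Hypothetical L_SSIM: penalize dissimilarity between the fused image and each source."""
    return (1 - ssim_global(fused, src1)) + (1 - ssim_global(fused, src2))


image = [[10, 200], [30, 90]]
print(ssim_global(image, image))  # identical images give the maximum similarity
```

A loss of this shape decreases as the fused image becomes structurally closer to both sources, which matches the stated aim of preserving structural information.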
[0048] The second technical solution adopted in this invention is: a multimodal medical image fusion system with a multi-channel integrated network, comprising:
[0049] The acquisition module is used to acquire the multimodal medical images to be fused.
[0050] The building block is used to introduce dense residual blocks to construct a multi-scale, multi-channel image fusion network model;
[0051] The fusion module is used to input the multimodal medical images to be fused into a multi-scale, multi-channel image fusion network model for fusion processing, and output the fused multimodal medical images.
[0052] Furthermore, the present invention also provides a terminal device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the computer program including the multimodal medical image fusion method of the multichannel integrated network described in any of the above claims.
[0053] Furthermore, the present invention also provides a computer storage medium, the computer-readable storage medium including a stored computer program, wherein the computer program includes the multimodal medical image fusion method of the multichannel integrated network described in any of the above claims.
[0054] The beneficial effects of the method and system of this invention are as follows: This invention acquires multimodal medical images to be fused and constructs a multi-scale, multi-channel image fusion network model to perform fusion processing on the multimodal medical images to be fused. Three channels are designed for each scale to fully extract different feature information. To fully utilize the hierarchical features between all convolutional layers, nine residual dense connection mechanisms are used in the entire multi-scale, multi-channel image fusion network model. Each channel plays an important role. Gradient residual dense connection blocks can retain sufficient fine-grained texture and details from the source image and achieve feature reuse. The multi-spectral channel attention mechanism can overcome the shortcomings of global pooling in general channel attention mechanisms and extract channel information. The spatial attention mechanism can extract spatial domain information from the feature map. Therefore, this invention fuses different and complementary information from the original multimodal medical images into one image, which can help doctors better analyze difficult and complex diseases and reduce misdiagnosis and surgical errors.
Attached Figure Description
[0055] Figure 1 is a flowchart illustrating the steps of the multimodal medical image fusion method using a multichannel integrated network according to the present invention;
[0056] Figure 2 is a structural block diagram of the multimodal medical image fusion system of the multichannel integrated network of the present invention;
[0057] Figure 3 is a structural block diagram of MCAFusion, the multi-scale, multi-channel image fusion network model proposed in this invention;
[0058] Figure 4 is a structural block diagram of the RDB module in the multi-scale, multi-channel image fusion network model of this invention;
[0059] Figure 5 is a structural block diagram of the hourglass block in the multi-scale, multi-channel image fusion network model of this invention;
[0060] Figure 6 is a schematic diagram showing the results of comparing the method of the present invention with three recent and classic image fusion methods.
Detailed Implementation
[0061] The present invention will now be described in further detail with reference to the accompanying drawings and specific embodiments. The step numbers in the following embodiments are only for ease of explanation and do not limit the order of the steps. The execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
[0062] Referring to Figure 1 and Figure 3, this invention provides a multimodal medical image fusion method using a multichannel integrated network, the method comprising the following steps:
[0063] S1. Image preprocessing: color source images, such as SPECT images, need to be converted from the RGB color space to the YCrCb color space;
[0064] S2. The two source images are merged along the channel dimension and input into a single convolutional layer to obtain 16 initial feature maps, denoted as Feat_init;
[0065] S3. The 16 Feat_init feature maps are fed into three different convolutional networks;
[0066] S4. The three convolutional networks perform feature extraction on the 16 Feat_init feature maps by applying residual dense connection layers 1, 2, and 3 times, respectively, each resulting in 16 feature maps, denoted as Feat_Sec;
[0067] Specifically, the residual dense block (RDB) is shown in Figure 4. This invention uses RDBs to replace most of the ordinary convolutions because, with the same number of network parameters, the more elaborate yet well-founded structure of the RDB gives it a better feature extraction capability than ordinary convolution. As shown in Figure 4, each RDB in this invention has three convolutional layers with a kernel of 3*3 and one convolutional layer with a kernel of 1*1. The first three convolutional layers use dense connections to achieve feature reuse and fully extract the features of multiple convolutional layers. The fourth convolutional layer fuses all the features extracted by the previous layers. Finally, the output of the fourth convolutional layer is added to the input to form the final output.
[0068] In other words, the main stream first passes through three densely connected convolutional layers, all of its features are then fused by a 1*1 convolution, and finally the input features are added element-wise to the main stream to give the output of the RDB.
[0069] S5. Each convolutional network inputs its Feat_Sec into three different channels, which extract three types of information from the feature map: spatial domain information, channel information, and fine-grained features.
[0070] S51. Fine-grained features;
[0071] Specifically, as shown in Figure 3(b), the first two convolutional layers use a dense connection mechanism, followed by a convolutional layer with a 1*1 kernel to eliminate differences in channel dimensions. The residual stream mainly uses the Sobel operator to perform gradient operations on the input feature map, and then uses a convolutional layer with a 1*1 kernel to synthesize the gradient information. Finally, the output of the main stream and the output of the residual stream are added element-wise to obtain the final output. The formula for extracting fine-grained features is as follows:
[0072] $F = \mathrm{Cov}^{n}(x) \oplus \mathrm{Cov}(\nabla x)$
[0073] In the above formula, $x$ represents the input feature map, $\mathrm{Cov}(\cdot)$ represents a convolutional layer, $\mathrm{Cov}^{n}(\cdot)$ represents n cascaded convolutional layers, $\nabla$ represents the Sobel gradient operator, $\oplus$ indicates element-wise addition, and $F$ represents the output of the GRDB module.
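The gradient residual stream of step S51 can be sketched as follows, assuming the block output is the element-wise sum of the main stream and the processed Sobel gradient of the input. The details are illustrative assumptions: the gradient magnitude is approximated as |Gx| + |Gy|, border pixels are replicated, and a scalar `residual_weight` stands in for the learned 1*1 convolution. All function names are hypothetical.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter3x3(img, kernel):
    """Correlate a 2D list with a 3x3 kernel; border pixels are replicated."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    acc += img[ii][jj] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out


def sobel_gradient(img):
    """Gradient magnitude approximated as |Gx| + |Gy| (an assumption)."""
    gx, gy = filter3x3(img, SOBEL_X), filter3x3(img, SOBEL_Y)
    return [[abs(a) + abs(b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]


def grdb_output(main_stream, x, residual_weight=1.0):
    """Add the weighted Sobel gradient of x to the main-stream output, element by element."""
    grad = sobel_gradient(x)
    return [[m + residual_weight * g for m, g in zip(rm, rg)]
            for rm, rg in zip(main_stream, grad)]


# Vertical step edge: the gradient responds only in the two columns next to the edge.
edge = [[0, 0, 1, 1] for _ in range(4)]
grad = sobel_gradient(edge)
```

Because the gradient is zero in flat regions, the residual stream contributes only where the source image has edges or texture, which is how the block carries fine-grained detail forward.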
[0074] S52, Spatial Domain Information;
[0075] Specifically, as shown in Figure 5, the input $x_{in}$ first enters the main feature branch and passes through two convolutional layers to give $f$. $f$ then enters the attention branch, and the output $a$ of the attention branch is multiplied element-wise with $f$. Finally, the product is added element-wise to $x_{in}$ to give the output $x_{out}$. The formula for extracting spatial domain information is as follows:
[0076] $f = F_{feat}(x_{in})$
[0077] $a = \sigma(F_{att}(f))$
[0078] $x_{out} = x_{in} \oplus (f \otimes a)$
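A minimal sketch of the spatial attention step S52 follows, assuming the output is the input plus the element-wise product of the feature-branch result and its sigmoid attention map, as the description of Figure 5 suggests. The two branches are passed in as plain functions; `double` and `zero_logits` are hypothetical stand-ins for the learned convolutional branches.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def spatial_attention(x_in, feat_branch, att_branch):
    """Return x_in + f * a with f = feat_branch(x_in) and a = sigmoid(att_branch(f))."""
    f = feat_branch(x_in)
    logits = att_branch(f)
    a = [[sigmoid(v) for v in row] for row in logits]
    return [
        [x + fv * av for x, fv, av in zip(rx, rf, ra)]
        for rx, rf, ra in zip(x_in, f, a)
    ]


def double(m):
    """Stand-in feature branch: scales every value by 2."""
    return [[2.0 * v for v in row] for row in m]


def zero_logits(m):
    """Stand-in attention branch: zero logits, so every attention weight is 0.5."""
    return [[0.0 for _ in row] for row in m]


x = [[1.0, 2.0], [3.0, 4.0]]
x_out = spatial_attention(x, double, zero_logits)
```

The residual form means that even when the attention weights are small, the input features still pass through unchanged.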
[0079] S53, Channel Information;
[0080] Specifically, as shown in Figure 3(d), the input feature map is divided along the channel dimension into n parts, denoted $[X_0, X_1, ..., X_{n-1}]$, where C represents the number of channels and C must be divisible by n. Each part is assigned a corresponding 2D DCT frequency component according to the selection policy. Each part is then multiplied element-wise with its 2D DCT component and summed to obtain $Freq_i$. The entire compressed vector Freq is obtained by merging all parts, and a fully connected network operation is then performed to obtain the multispectral channel attention map att. Finally, att is multiplied by the input feature map to obtain the final channel attention output. The formula for extracting channel information is as follows:
[0081] $B_{h,w}^{u_i,v_i} = \cos\left(\frac{\pi u_i}{H}\left(h + \frac{1}{2}\right)\right)\cos\left(\frac{\pi v_i}{W}\left(w + \frac{1}{2}\right)\right)$
[0082] $Freq_i = \sum_{h=0}^{H-1}\sum_{w=0}^{W-1} X_{i,:,h,w}\, B_{h,w}^{u_i,v_i}$
[0083] $Freq = compress(X) = cat([Freq_0, Freq_1, ..., Freq_{n-1}])$
[0084] $att = sigmoid(fc(Freq))$
[0085]
[0086] In the above formula, $H$ and $W$ are the height and width of the feature map, respectively; $h$ and $w$ index the rows and columns of the feature map; $[u_i, v_i]$ are the two-dimensional indices of the frequency component corresponding to $X_i$; Freq is the overall compressed vector; and the number of groups is n = 16. Each part is assigned a corresponding two-dimensional DCT frequency component B. The best-performing "Top 16" components provided by the authors are directly selected as the two-dimensional frequency components, where X = [0, 0, 6, 0, 0, 1, 1, 4, 5, 1, 3, 0, 0, 0, 3, 2], Y = [0, 1, 0, 5, 2, 0, 2, 0, 0, 6, 0, 4, 6, 3, 5, 2], $u_i$ = X[i], $v_i$ = Y[i], i∈{0, 1, ..., n-1}.
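The compression step of S53 can be sketched with the "Top 16" frequency indices quoted above. The sketch assumes the standard DCT-II cosine basis and uses the indices directly on a 7x7 feature map (how the indices are scaled for other map sizes is not stated in the text). `fc` is left as a caller-supplied function because the fully connected layers are learned; the function names are hypothetical.

```python
import math

# Frequency indices quoted in the text ("Top 16"); u comes from X, v from Y.
TOP16_U = [0, 0, 6, 0, 0, 1, 1, 4, 5, 1, 3, 0, 0, 0, 3, 2]
TOP16_V = [0, 1, 0, 5, 2, 0, 2, 0, 0, 6, 0, 4, 6, 3, 5, 2]


def dct_basis(u, v, height, width):
    """2D DCT basis B[h][w] = cos(pi*u*(h+0.5)/H) * cos(pi*v*(w+0.5)/W)."""
    return [
        [math.cos(math.pi * u * (h + 0.5) / height)
         * math.cos(math.pi * v * (w + 0.5) / width)
         for w in range(width)]
        for h in range(height)
    ]


def compress(feature_maps, us=TOP16_U, vs=TOP16_V):
    """Return one DCT coefficient per channel; channels are split into len(us) equal groups."""
    channels, n = len(feature_maps), len(us)
    assert channels % n == 0, "C must be divisible by n"
    group = channels // n
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    freq = []
    for c, fmap in enumerate(feature_maps):
        basis = dct_basis(us[c // group], vs[c // group], height, width)
        freq.append(sum(fmap[h][w] * basis[h][w]
                        for h in range(height) for w in range(width)))
    return freq


def channel_attention(freq, fc):
    """att = sigmoid(fc(Freq)); fc stands in for the learned fully connected layers."""
    return [1.0 / (1.0 + math.exp(-z)) for z in fc(freq)]


# 16 constant 7x7 channels: only the (0, 0) component sees any energy.
ones = [[[1.0] * 7 for _ in range(7)] for _ in range(16)]
freq = compress(ones)
```

For the (0, 0) index the basis is all ones, so that coefficient is just the sum over the map, i.e. global average pooling up to a constant factor. The other fifteen components respond to spatial variation that global pooling would discard, which is the motivation given for the multispectral mechanism.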
[0087] S6. Merge the three types of information obtained from each convolutional network along the channel dimension, and then add the merged information from the convolutional networks to obtain the comprehensive feature map, denoted as Feat_thd;
[0088] S7. Finally, Feat_thd is input into the last convolutional layer to obtain the fused image;
[0089] S8. Obtain the fusion map under the Y channel. The loss function is calculated with the source image MRI (I1) and the source image SPECT or PET (I2) under the Y channel, with the aim of preserving as much structural and functional information as possible in the fused image;
[0090] Specifically, its expression is:
[0091] $L = L_{SSIM} + L_{SF}$
[0092]
[0093] $SSIM(I_1, I_f) = \frac{(2\mu_{I_1}\mu_{I_f} + C_1)(2\sigma_{I_1 I_f} + C_2)}{(\mu_{I_1}^2 + \mu_{I_f}^2 + C_1)(\sigma_{I_1}^2 + \sigma_{I_f}^2 + C_2)}$
[0094] In the above formula, $\mu_{I_1}$ and $\mu_{I_f}$ represent the mean intensity of the multimodal medical image $I_1$ to be fused and of the fused multimodal medical image $I_f$, respectively; $C_1$ and $C_2$ represent two constants; $\sigma_{I_1}^2$ and $\sigma_{I_f}^2$ represent the variance of $I_1$ and of $I_f$, respectively; $\sigma_{I_1 I_f}$ represents the covariance of $I_1$ and $I_f$; SF represents the spatial frequency of the image; $L$ represents the total loss function; $L_{SSIM}$ represents the structural similarity loss; and $L_{SF}$ represents the spatial frequency loss;
[0095] where,
[0096] $L_{SF} = \beta \cdot \|SF(I_f) - SF(I_1)\|_2 + \gamma \cdot \|SF(I_f) - SF(I_2)\|_2$
[0097] In the above formula, SF represents the spatial frequency of the image, $\|\cdot\|_2$ represents the L2 norm, and $\beta$ and $\gamma$ are weights. SF is derived from the gradients in the vertical and horizontal directions and reflects the gray-level variation of the image. The formula is as follows:
[0098] $SF = \sqrt{Hor^2 + Ver^2}$
[0099] In the above formula, Hor and Ver represent the horizontal and vertical gradients, respectively;
[0100] The formula for the horizontal gradient is as follows:
[0101]
[0102] The formula for the vertical gradient is as follows:
[0103]
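The spatial frequency term can be illustrated with a short sketch. It assumes the customary definition in which the horizontal and vertical gradients are root-mean-square first differences along rows and columns, normalized by the number of pixels; the patent's own gradient formulas are not legible in this text, so that normalization is an assumption. Since SF is a scalar, the L2 norm of a difference of two SF values reduces to an absolute value. The weights `beta` and `gamma` default to 1.0 purely for illustration.

```python
import math


def spatial_frequency(img):
    """SF = sqrt(Hor^2 + Ver^2) from first differences along rows and columns."""
    h, w = len(img), len(img[0])
    hor_sq = sum((img[i][j] - img[i][j - 1]) ** 2
                 for i in range(h) for j in range(1, w)) / (h * w)
    ver_sq = sum((img[i][j] - img[i - 1][j]) ** 2
                 for i in range(1, h) for j in range(w)) / (h * w)
    return math.sqrt(hor_sq + ver_sq)


def sf_loss(fused, src1, src2, beta=1.0, gamma=1.0):
    """L_SF = beta * |SF(I_f) - SF(I_1)| + gamma * |SF(I_f) - SF(I_2)|."""
    sf_f = spatial_frequency(fused)
    return (beta * abs(sf_f - spatial_frequency(src1))
            + gamma * abs(sf_f - spatial_frequency(src2)))


flat = [[7, 7], [7, 7]]
checker = [[0, 1], [1, 0]]
print(spatial_frequency(flat), spatial_frequency(checker))
```

A flat image has zero spatial frequency, while a checkerboard, which changes at every pixel, has a high one. The loss therefore pushes the fused image towards the same overall level of detail as the sources.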
[0108] In summary, the steps of this invention are as follows: first, convert the RGB color image to a YCrCb image; then, merge the Y channel of the color image with the MRI medical image and input them into the MCAFusion fusion framework to obtain a fused image in the Y channel; and finally, convert the fused Y-channel image back to the final RGB image.
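The three-stage summary above can be sketched end to end. The sketch makes several assumptions: BT.601 full-range conversion coefficients, reuse of the color image's own Cr and Cb values when converting back to RGB, and a per-pixel function `fuse_y` standing in for the trained MCAFusion network (which in reality operates on whole images, not single pixels). All names are hypothetical.

```python
def rgb_to_ycrcb(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, (r - y) * 0.713 + 128.0, (b - y) * 0.564 + 128.0


def ycrcb_to_rgb(y, cr, cb):
    r = y + (cr - 128.0) / 0.713
    b = y + (cb - 128.0) / 0.564
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


def fuse_color_with_gray(color_rgb, gray, fuse_y):
    """Fuse the Y channel of a color image with a grayscale image, then restore color."""
    fused = []
    for color_row, gray_row in zip(color_rgb, gray):
        out_row = []
        for (r, g, b), m in zip(color_row, gray_row):
            y, cr, cb = rgb_to_ycrcb(r, g, b)
            y_fused = fuse_y(y, m)  # stand-in for the fusion network
            out_row.append(ycrcb_to_rgb(y_fused, cr, cb))  # original chroma is reused
        fused.append(out_row)
    return fused


pet = [[(255.0, 0.0, 0.0), (10.0, 200.0, 30.0)]]  # toy color (functional) image
mri = [[50.0, 60.0]]                              # toy grayscale (anatomical) image
result = fuse_color_with_gray(pet, mri, max)      # max() is only a placeholder rule
```

Working on the Y channel alone keeps the network single-channel while leaving the functional image's color coding untouched.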
[0109] Furthermore, comparative experimental analysis was conducted based on the algorithm of this invention:
[0110] Referring to Figure 6, some local areas were magnified to make the comparison more intuitive. Overall, all four methods achieved relatively good results, each with its own advantages; however, compared with the method proposed in this invention, the existing methods have some shortcomings. The goal of image fusion is to integrate as much information from multiple source images as possible into a single image. Therefore, when two types of medical images are fused, the fused image should retain as much structural information from the structural source image and as much functional information from the functional source image as possible. Specifically, the MSRPAN fusion method is weak at preserving MRI information; in PET-MRI fusion, the magnified local areas show that its edges are not smooth enough. The U2Fusion result suffers from low contrast, especially in the exoskeleton area of the MRI. The MATR and MCAFusion results are the closest to each other: for example, in PET-MRI fusion, local magnification shows that the color and detail of both are very close to the source images. However, in preserving both MRI structural information and SPECT and PET functional information, the fusion effect of this application is slightly better than that of MATR.
[0111] Table 1 shows the average value of each index for each of the four methods over 19 PET-MRI image pairs. It can be seen that, for the two types of fused images, the AG, QS, SSIM, and SF of the fused image from this invention are the best. In particular, the excellent average gradient (AG) value reflects that the edges and details of the fused image from this invention are better than those of the three comparison methods. As shown in Figure 6, the fused image of this invention has clearer edges and more image details than the other methods. For PET-MRI fusion, the SCD index of this invention is also the best.
[0112] Table 1. Data results of the method of the present invention compared with three recent and classical image fusion methods under various indicators.
[0113]
Methods           QNCIE   QS      QCB     VIF     AG      SSIM    SF       SCD
This application  0.8062  0.9290  0.6920  0.3008  7.8433  0.8154  28.1276  1.4506
U2Fusion          0.8050  0.6561  0.2948  0.2327  5.6260  0.5689  19.1421  0.7796
MSRPAN            0.8072  0.8605  0.6447  0.1815  6.8411  0.7956  27.9082  1.1578
MATR              0.8075  0.9165  0.7039  0.3424  7.0537  0.8074  24.7307  0.6643
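The claims made about Table 1 can be checked mechanically. The sketch below copies the table values and finds the best method per index, assuming that a higher value is better for all eight indices; `best_per_metric` is a hypothetical helper.

```python
METRICS = ["QNCIE", "QS", "QCB", "VIF", "AG", "SSIM", "SF", "SCD"]
TABLE_1 = {
    "This application": [0.8062, 0.9290, 0.6920, 0.3008, 7.8433, 0.8154, 28.1276, 1.4506],
    "U2Fusion": [0.8050, 0.6561, 0.2948, 0.2327, 5.6260, 0.5689, 19.1421, 0.7796],
    "MSRPAN": [0.8072, 0.8605, 0.6447, 0.1815, 6.8411, 0.7956, 27.9082, 1.1578],
    "MATR": [0.8075, 0.9165, 0.7039, 0.3424, 7.0537, 0.8074, 24.7307, 0.6643],
}


def best_per_metric(table, metrics):
    """Map each metric to the method with the highest value (higher is assumed better)."""
    return {
        name: max(table, key=lambda method: table[method][k])
        for k, name in enumerate(metrics)
    }


best = best_per_metric(TABLE_1, METRICS)
for metric in METRICS:
    print(f"{metric:6s} {best[metric]}")
```

Under that assumption the method of this application leads on QS, AG, SSIM, SF and SCD, in line with the text, while MATR leads on QNCIE, QCB and VIF.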
[0114] Referring to Figure 2, a multimodal medical image fusion system with a multi-channel integrated network includes:
[0115] The acquisition module is used to acquire the multimodal medical images to be fused.
[0116] The building block is used to introduce dense residual blocks to construct a multi-scale, multi-channel image fusion network model;
[0117] The fusion module is used to input the multimodal medical images to be fused into a multi-scale, multi-channel image fusion network model for fusion processing, and output the fused multimodal medical images.
[0118] Another embodiment of the present invention provides a terminal device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the computer program including the multimodal medical image fusion method of the multichannel integrated network described in any of the above claims.
[0119] Another embodiment of the present invention provides a computer storage medium, the computer-readable storage medium including a stored computer program, wherein the computer program includes the multimodal medical image fusion method of the multichannel integrated network described in any of the preceding claims.
[0120] The computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps of the multimodal medical image fusion method in any of the above embodiments. The computer-readable storage medium referred to herein includes random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, register, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art.
[0121] Those skilled in the art will further recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present invention.
[0122] The content of the above method embodiments is applicable to this system embodiment. The specific functions implemented in this system embodiment are the same as those in the above method embodiments, and the beneficial effects achieved are also the same as those achieved in the above method embodiments.
[0123] The above is a detailed description of the preferred embodiments of the present invention. However, the present invention is not limited to the embodiments described. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention. All such equivalent modifications or substitutions are included within the scope defined by the claims of this application.
Claims
1. A multimodal medical image fusion method using a multi-channel integrated network, characterized in that it includes the following steps: acquiring multimodal medical images to be fused; introducing dense residual blocks to construct a multi-scale, multi-channel image fusion network model; and inputting the multimodal medical images to be fused into the multi-scale, multi-channel image fusion network model for fusion processing, and outputting the fused multimodal medical image; wherein the step of inputting the multimodal medical images to be fused into the multi-scale, multi-channel image fusion network model for fusion processing and outputting the fused multimodal medical image specifically includes: inputting the multimodal medical images to be fused into the multi-scale, multi-channel image fusion network model; performing preliminary feature extraction processing on the multimodal medical images to be fused, based on the first convolutional layer, normalization layer, and rectified linear unit (ReLU) of the multi-scale, multi-channel image fusion network model, to obtain an initial feature map; inputting the initial feature map into the first residual dense connection layer, the second residual dense connection layer, and the third residual dense connection layer respectively for feature extraction processing to obtain the corresponding feature maps; receiving, by the channel extraction module of the multi-scale, multi-channel image fusion network model, the feature map output by each residual dense connection layer, and performing information extraction processing on the feature map to obtain the spatial domain information, channel information and fine-grained feature information of the corresponding feature map; merging the spatial domain information, channel information, and fine-grained feature information of the feature map to obtain a comprehensive feature map; and performing fusion processing on the comprehensive feature map, based on the second convolutional layer of the multi-scale, multi-channel image fusion network model, to obtain the fused multimodal medical image.
2. The multimodal medical image fusion method using a multi-channel integrated network according to claim 1, characterized in that the step of acquiring the multimodal medical images to be fused specifically includes: acquiring multimodal medical images with an RGB color space; converting a multimodal medical image with an RGB color space to a YCrCb color space to obtain a multimodal medical image with a YCrCb color space, wherein the YCrCb color space multimodal medical image includes a Y channel, a Cr channel and a Cb channel; and merging the Y channel image of the multimodal medical image in the YCrCb color space with the multimodal medical image in the RGB color space to obtain the multimodal medical image to be fused.
3. The multimodal medical image fusion method using a multi-channel integrated network according to claim 2, characterized in that the step of introducing dense residual blocks to construct a multi-scale, multi-channel image fusion network model specifically comprises:
constructing a multi-scale, multi-channel image fusion network model that comprises a first convolutional layer, a normalization layer, a rectified linear unit (ReLU), a dense residual block, a channel extraction module, and a second convolutional layer;
wherein the dense residual block comprises a first residual dense connection layer, a second residual dense connection layer, and a third residual dense connection layer; each residual dense connection layer consists of three convolutional layers with a kernel size of […] and one convolutional layer with a kernel size of […]; and each residual dense connection layer is connected to a channel extraction module.
4. The multimodal medical image fusion method using a multi-channel integrated network according to claim 3, characterized in that the step of inputting the initial feature map into the first residual dense connection layer, the second residual dense connection layer, and the third residual dense connection layer, respectively, for feature extraction to obtain the corresponding feature maps specifically comprises:
inputting the initial feature map into the first residual dense connection layer, the second residual dense connection layer, and the third residual dense connection layer, respectively;
performing feature extraction on the initial feature map through the three densely connected convolutional layers with a kernel size of […] of each residual dense connection layer, to obtain a first feature extraction map, a second feature extraction map, and a third feature extraction map;
fusing the first feature extraction map, the second feature extraction map, and the third feature extraction map through the convolutional layer with a kernel size of […] of each residual dense connection layer, to obtain a fused feature map; and
adding the fused feature map to the corresponding initial feature map, and outputting the corresponding feature map.
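The data flow of one residual dense connection layer in claim 4 can be illustrated with a small pure-Python sketch. It is not the patented implementation: the 3×3 and 1×1 kernel sizes, the growth of one channel per convolution, the constant kernel values, and the helper names are all assumptions made for illustration, and activations and normalization are omitted.

```python
def conv2d_same(x, kernels):
    """Zero-padded, stride-1 2-D convolution (cross-correlation form).

    x:       list of C_in channels, each an H x W list of floats
    kernels: kernels[o][i] is the k x k kernel from input channel i to output o
    """
    h, w = len(x[0]), len(x[0][0])
    k = len(kernels[0][0])
    pad = k // 2
    out = []
    for per_input in kernels:
        plane = [[0.0] * w for _ in range(h)]
        for channel, kernel in zip(x, per_input):
            for r in range(h):
                for c in range(w):
                    acc = 0.0
                    for dr in range(k):
                        for dc in range(k):
                            rr, cc = r + dr - pad, c + dc - pad
                            if 0 <= rr < h and 0 <= cc < w:
                                acc += kernel[dr][dc] * channel[rr][cc]
                    plane[r][c] += acc
        out.append(plane)
    return out


def residual_dense_layer(x, dense_kernels, fuse_kernels):
    """Three densely connected convolutions, one fusion convolution, then a skip add."""
    features = list(x)      # running concatenation along the channel axis
    extracted = []          # first, second and third feature extraction maps
    for kernels in dense_kernels:
        y = conv2d_same(features, kernels)
        extracted.extend(y)
        features = features + y   # dense connection: later layers see earlier outputs
    fused = conv2d_same(extracted, fuse_kernels)   # fuse the three extraction maps
    # Residual connection: add the fused map to the layer input, element by element.
    return [[[a + b for a, b in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(fused, x)]


def constant_kernels(c_out, c_in, k, value):
    """Hypothetical helper: every kernel weight set to the same value."""
    return [[[[value] * k for _ in range(k)] for _ in range(c_in)]
            for _ in range(c_out)]


# Usage: 2 input channels of size 4 x 4, each dense convolution adds 1 channel.
C, G = 2, 1
x = [[[float(r * 4 + c + ch) for c in range(4)] for r in range(4)] for ch in range(C)]
dense = [constant_kernels(G, C + j * G, 3, 0.01) for j in range(3)]   # assumed 3x3
fuse = constant_kernels(C, 3 * G, 1, 0.5)                             # assumed 1x1
y = residual_dense_layer(x, dense, fuse)
```

Because of the skip connection, the output has the same shape as the input, and a fusion convolution with all-zero weights leaves the input unchanged.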
5. The multimodal medical image fusion method using a multi-channel integrated network according to claim 4, characterized in that:
the fine-grained feature information of the feature map is extracted according to the formula […], whose terms denote, in order: the input feature map; a convolutional layer; cascaded convolutional layers; the Sobel gradient operator; element-wise addition; and the output of the GRDB module;
the spatial domain information of the feature map is extracted according to the formula […], whose terms denote, in order: the input feature map; the result output after element-wise addition of […] and […]; the sigmoid function; a feature-branch convolutional layer; an attention-branch convolutional layer; the result of the feature-branch convolutional layer; and element-wise multiplication; and
the channel information of the feature map is extracted according to the formula […], whose terms denote, in order: the length and width of the feature map; the two-dimensional index of the frequency component corresponding to […]; the overall compressed vector; the frequency components of the two-dimensional DCT; the multispectral channel attention map; the frequency-domain components; the constant π; the height of the feature map; the width of the feature map; the […]-th row of the feature map; the […]-th column of the feature map; the […]-th partial compressed vector; the […]-th group of the feature map; channel merging; and a fully connected layer.
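The channel-information terms listed in claim 5 (two-dimensional DCT frequency components, per-group partial compressed vectors, channel merging, and a fully connected layer) resemble the usual formulation of multispectral channel attention. The sketch below assumes that formulation and is not taken from the patent text: it uses the unnormalized DCT basis cos(πu(h + ½)/H)·cos(πv(w + ½)/W), a single fully connected layer followed by a sigmoid, and hypothetical names and weights throughout.

```python
import math


def dct_basis(u, v, height, width):
    """Unnormalized 2-D DCT basis for the frequency index (u, v)."""
    return [[math.cos(math.pi * u * (h + 0.5) / height)
             * math.cos(math.pi * v * (w + 0.5) / width)
             for w in range(width)] for h in range(height)]


def compressed_vector(feature_map, freq_indices):
    """One DCT coefficient per channel; channels are split into equal groups.

    feature_map:  list of C channels, each an H x W list of floats
    freq_indices: one (u, v) frequency index per channel group
    """
    channels, groups = len(feature_map), len(freq_indices)
    assert channels % groups == 0, "channels must divide evenly into groups"
    per_group = channels // groups
    height, width = len(feature_map[0]), len(feature_map[0][0])
    freq = []   # channel merging: partial vectors are appended in group order
    for g, (u, v) in enumerate(freq_indices):
        basis = dct_basis(u, v, height, width)
        for channel in feature_map[g * per_group:(g + 1) * per_group]:
            freq.append(sum(channel[h][w] * basis[h][w]
                            for h in range(height) for w in range(width)))
    return freq


def channel_attention(freq, fc_weights):
    """sigmoid(fc(freq)); fc_weights is a hypothetical C x C weight matrix."""
    return [1.0 / (1.0 + math.exp(-sum(wt * f for wt, f in zip(row, freq))))
            for row in fc_weights]


# Usage: two constant 2 x 2 channels, one group per channel.
ones = [[1.0, 1.0], [1.0, 1.0]]
freq = compressed_vector([ones, ones], [(0, 0), (0, 1)])
attention = channel_attention(freq, [[0.0, 0.0], [0.0, 0.0]])
```

With the index (0, 0) the basis is constant, so the coefficient reduces to the sum over the channel, which is proportional to global average pooling; higher indices respond only to spatial variation and vanish on a constant channel.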
6. The multimodal medical image fusion method using a multi-channel integrated network according to claim 5, characterized in that the method further comprises calculating a loss function based on the multimodal medical images to be fused and the fused multimodal medical image, expressed by the formula […], whose terms denote, in order: the intensities of the multimodal medical images to be fused and of the fused multimodal medical image; two constants; the variances of the multimodal medical images to be fused and of the fused multimodal medical image; the covariance between the multimodal medical images to be fused and the fused multimodal medical image; the spatial frequency of an image; the total loss function; the structural similarity loss; and the spatial frequency domain loss.
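The two quantities named in claim 6 have standard definitions, which the following sketch computes. It assumes the conventional structural similarity index evaluated over the whole image (rather than over local windows), the usual constants (0.01·255)² and (0.03·255)², and the usual row/column-difference definition of spatial frequency. How the patent combines the two terms into the total loss is not recoverable from the claim text and is therefore not shown.

```python
import math


def _flatten(image):
    return [value for row in image for value in row]


def ssim_global(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Structural similarity from global means, variances and covariance.

    Assumption: c1 and c2 are the conventional stabilizing constants for
    8-bit images; the claim only states that two constants are used.
    """
    xs, ys = _flatten(x), _flatten(y)
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    var_x = sum((a - mu_x) ** 2 for a in xs) / n
    var_y = sum((b - mu_y) ** 2 for b in ys) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))


def spatial_frequency(image):
    """sqrt(RF^2 + CF^2) from horizontal and vertical first differences."""
    h, w = len(image), len(image[0])
    row_freq_sq = sum((image[i][j] - image[i][j - 1]) ** 2
                      for i in range(h) for j in range(1, w)) / (h * w)
    col_freq_sq = sum((image[i][j] - image[i - 1][j]) ** 2
                      for i in range(1, h) for j in range(w)) / (h * w)
    return math.sqrt(row_freq_sq + col_freq_sq)


# Usage: a hypothetical source image and a slightly lower-contrast fused image.
source = [[0.0, 255.0], [0.0, 255.0]]
fused = [[10.0, 240.0], [10.0, 240.0]]
similarity = ssim_global(source, fused)
sharpness_gap = spatial_frequency(source) - spatial_frequency(fused)
```

A structural similarity close to 1 indicates that the fused image keeps the structure of the source, while a smaller spatial frequency in the fused image indicates lost edge detail.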
7. A multimodal medical image fusion system using a multi-channel integrated network, characterized in that the system is configured to perform the multimodal medical image fusion method using a multi-channel integrated network according to claim 1 and comprises the following modules:
an acquisition module, configured to acquire the multimodal medical images to be fused;
a construction module, configured to introduce dense residual blocks to construct a multi-scale, multi-channel image fusion network model; and
a fusion module, configured to input the multimodal medical images to be fused into the multi-scale, multi-channel image fusion network model for fusion processing, and to output the fused multimodal medical image.
8. A terminal device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the computer program comprises the multimodal medical image fusion method using a multi-channel integrated network according to any one of claims 1-6.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a stored computer program, wherein the computer program comprises the multimodal medical image fusion method using a multi-channel integrated network according to any one of claims 1-6.