Unbalanced SAR image recognition method and system based on annular large-margin Gaussian mixture loss
By constructing a convolutional neural network based on ring-shaped large-margin Gaussian mixture loss, the problems of class imbalance and speckle noise in target recognition of synthetic aperture radar images are solved, thereby improving the recognition accuracy and noise resistance.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHAANXI NORMAL UNIV
- Filing Date
- 2024-04-25
- Publication Date
- 2026-06-23
AI Technical Summary
Existing technologies suffer from class imbalance in target recognition of synthetic aperture radar images and exhibit poor recognition performance under strong speckle noise.
A convolutional neural network is constructed using a method based on ring-based large-margin Gaussian mixture loss. The convolutional neural network model is trained by combining a multi-task loss function composed of large-margin Gaussian mixture, ring loss, Euclidean loss, and total variation, which enhances intra-class compactness and inter-class separability and reduces the influence of speckle noise.
It improves the accuracy and noise resistance of target recognition in synthetic aperture radar images, solves the class imbalance problem, and enhances recognition performance.
Figure CN118379627B_ABST
Description
Technical Field
[0001] This invention belongs to the field of image processing technology, specifically relating to an unbalanced SAR image recognition method and system based on ring-shaped large-margin Gaussian mixture loss.
Background Technology
[0002] Compared with optical remote sensing, synthetic aperture radar (SAR) offers a powerful and versatile tool for remote sensing due to its all-weather, all-time, and wide-coverage advantages. The main purpose of target recognition in SAR images is to extract features from the image and determine the type of target. Target recognition is a crucial step in image interpretation and has received widespread attention from scholars both domestically and internationally. In recent years, deep learning algorithms have driven the development of target recognition in SAR images. However, when deep learning is used for target recognition in SAR, it is affected by class imbalance. Therefore, it is essential to study how to improve the recognition performance in SAR images with class imbalance.
[0003] Existing techniques for class-imbalanced synthetic aperture radar (SAR) image target recognition using convolutional neural networks mostly address the class imbalance problem by processing the dataset. However, their performance is not ideal under strong speckle noise. Therefore, this invention aims to disclose a target recognition method that can enhance intra-class compactness and inter-class separability, solve the class imbalance problem in SAR image target recognition, and effectively reduce the impact of speckle noise on SAR images.
Summary of the Invention
[0004] To address the problems existing in the prior art, this invention provides an unbalanced SAR image recognition method and system based on ring-shaped large-margin Gaussian mixture loss, which improves target recognition performance on class-imbalanced synthetic aperture radar images and enhances the robustness of target recognition to speckle noise.
[0005] This invention is achieved through the following technical solution:
[0006] An imbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss includes the following steps:
[0007] Based on the characteristic that speckle noise in measured SAR images conforms to the Gamma distribution, a training set of images with noise and a test set of images are constructed.
[0008] A convolutional neural network for imbalanced datasets and a multi-task loss function consisting of large-margin Gaussian mixture loss, ring loss, Euclidean loss, and total variation loss are constructed. Based on the multi-task loss function and the training set images, the convolutional neural network model is trained, and the parameters of the trained network model are obtained.
[0009] The test set images are recognized using the parameters of the trained network model, and the recognition result is the synthetic aperture radar image target recognition result.
[0010] Furthermore, based on the characteristic that speckle noise conforms to the Gamma distribution, the process of constructing noisy training and test set images is as follows:
[0011]
[0012] Where Γ(·) represents the gamma function and M is the shape parameter;
[0013] According to the formula for calculating the variance of the gamma distribution, the variance of speckle noise is 1 / M. The smaller the value of M, the larger the variance and the stronger the speckle noise.
[0014] Noisy training and test set images are constructed by multiplying random speckle noise with images in each original dataset.
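The density referred to in [0011] is not reproduced in this text. A unit-mean Gamma density with shape parameter M, which is the usual model for multi-look speckle and is consistent with the stated variance of 1 / M, would read as follows. The symbol n for the noise value is an assumption, not taken from the source.

```latex
\[
p(n) = \frac{M^{M}\, n^{M-1}\, e^{-Mn}}{\Gamma(M)}, \quad n \ge 0,
\qquad \mathbb{E}[n] = 1, \qquad \operatorname{Var}[n] = \frac{1}{M}.
\]
```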
[0015] Furthermore, the process of training the convolutional neural network model based on the multi-task loss function and the training set images is as follows:
[0016] The convolutional neural network consists of a denoising module and a recognition module. The denoising module includes convolutional layers, batch normalization layers, ReLU activation functions, and Tanh activation functions. The recognition module includes convolutional layers, max pooling layers, ReLU activation functions, and fully connected layers. Training set images are sequentially fed into the denoising and recognition modules.
[0017] Furthermore, the process of constructing the multi-task loss function, which consists of large-margin Gaussian mixture loss, ring loss, Euclidean loss, and total variation loss, is as follows:
[0018] The overall loss function L is:
[0019] $L = L_{\text{LGM-R}} + \alpha L_{E} + \beta L_{TV}$;
[0020] where α and β are predefined parameters, and $L_{\text{LGM-R}}$, $L_{E}$, and $L_{TV}$ are the individual loss function terms;
[0021] The ring large-margin Gaussian mixture loss function term $L_{\text{LGM-R}}$ is:
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
[0028] where $L_{LGM}$ denotes the large-margin Gaussian mixture loss and $L_{R}$ denotes the ring loss; $X = \{x_1, x_2, \ldots, x_N\}$ denotes the features learned by the network; $N$ is the number of training samples; $k$ indexes the $k$-th Gaussian distribution ($k = 1, 2, \ldots, K$); $p(k)$ is the prior probability of the $k$-th Gaussian distribution; $p(k_n)$ is the prior probability of class $k_n$; $k_n$ is the true class of the feature $x_n$ of the $n$-th sample ($n = 1, 2, \ldots, N$); $|\cdot|$ denotes the matrix determinant; $\Sigma_k$ is the covariance of the $k$-th Gaussian distribution and $\Sigma_{k_n}$ is the covariance of the Gaussian distribution of class $k_n$; $d_{k_n}$ is half the squared Mahalanobis distance between $x_n$ and the mean $\mu_{k_n}$ of the Gaussian distribution of class $k_n$; $d_k$ is half the squared Mahalanobis distance between $x_n$ and the mean $\mu_k$ of the $k$-th Gaussian distribution; $\Sigma_k^{-1}$ and $\Sigma_{k_n}^{-1}$ denote the inverses of the covariance matrices; $\mathbb{1}(\cdot)$ denotes the indicator function, which equals 1 when $k$ equals the true class $k_n$ and 0 otherwise; $m$ is the margin; $\|x_n\|_2$ is the L2 norm (Euclidean norm) of the feature $x_n$; and $R$ is a predefined reference radius.
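The formula images for [0022] to [0027] are not reproduced in this text. As an assumption only, a form that uses the quantities defined in [0028] and follows the commonly published forms of the large-margin Gaussian mixture loss and the ring loss would look like the following. The symbol names $L_{LGM}$ and $\mathbb{1}(\cdot)$, the unweighted sum of the two terms, and the normalization constants are guesses rather than the patent's own equations.

```latex
\begin{align}
L_{\text{LGM-R}} &= L_{LGM} + L_{R} \\
L_{LGM} &= -\frac{1}{N}\sum_{n=1}^{N}\log
  \frac{p(k_n)\,|\Sigma_{k_n}|^{-1/2}\, e^{-d_{k_n}(1+m)}}
       {\sum_{k=1}^{K} p(k)\,|\Sigma_k|^{-1/2}\, e^{-d_k\left(1 + m\,\mathbb{1}(k = k_n)\right)}} \\
d_k &= \tfrac{1}{2}\,(x_n-\mu_k)^{\top}\Sigma_k^{-1}(x_n-\mu_k) \\
L_{R} &= \frac{1}{2N}\sum_{n=1}^{N}\left(\|x_n\|_2 - R\right)^2
\end{align}
```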
[0029] The Euclidean loss function term $L_{E}$ is:
[0030]
[0031] where the input is the noisy SAR image, Y is the corresponding original image, and the mapping applied to the noisy image is the network used for noise reduction;
[0032] The total variation loss function term $L_{TV}$ is:
[0033]
[0034] where i is the row index and j is the column index of the image, and the indexed term denotes the pixel value located in the i-th row and j-th column of the denoised image;
[0035] The process of training the convolutional neural network model and obtaining the parameters of the trained network model is as follows:
[0036] The training data are input into the network for training, and the training set images pass through all modules in sequence;
[0037] During training, the mini-batch stochastic gradient descent algorithm is used. The data batch size, training epochs, learning rate, and learning rate decay value are set, and the values of parameters α and β are set respectively.
[0038] During training, the loss function is calculated through forward propagation, and the weights of the network are updated through backpropagation until the loss function converges, thus obtaining the parameters of the trained network model.
[0039] Furthermore, the large-margin Gaussian mixture loss function term is used to model the deep features of the dataset and to introduce a margin between the features of different classes;
[0040] The ring loss is used to obtain a balanced representation of each class of samples in the feature space.
[0041] Furthermore, the Euclidean loss function term $L_{E}$ is used to measure the similarity between samples before and after denoising, so that the denoised image is as close as possible to the original image Y.
[0042] Furthermore, the total variation loss function term $L_{TV}$ is used to reduce noise in the image: decreasing the total variation loss brings the values of adjacent pixels, in both the row and column directions, closer together.
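The two denoising terms can be illustrated with a minimal sketch. The formula images for $L_{E}$ and $L_{TV}$ are not reproduced in this text, so the exact forms below are assumptions: a mean squared pixel difference for the Euclidean term, and an isotropic total variation built from forward differences. The function names and the 3×3 test images are hypothetical.

```python
import math

def euclidean_loss(denoised, original):
    """Mean squared pixel difference between denoised and original images (assumed form)."""
    h, w = len(original), len(original[0])
    total = sum((denoised[i][j] - original[i][j]) ** 2
                for i in range(h) for j in range(w))
    return total / (h * w)

def total_variation_loss(denoised):
    """Sum of gradient magnitudes from forward differences (assumed isotropic form)."""
    h, w = len(denoised), len(denoised[0])
    total = 0.0
    for i in range(h - 1):
        for j in range(w - 1):
            dv = denoised[i + 1][j] - denoised[i][j]  # neighbour in the next row
            dh = denoised[i][j + 1] - denoised[i][j]  # neighbour in the next column
            total += math.sqrt(dv * dv + dh * dh)
    return total

# Hypothetical images: a flat patch and the same patch with one bright outlier.
flat = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
spiky = [[0.5, 0.5, 0.5], [0.5, 1.5, 0.5], [0.5, 0.5, 0.5]]

print(euclidean_loss(spiky, flat))   # grows with the distance to the clean image
print(total_variation_loss(flat))    # a perfectly smooth image has zero variation
print(total_variation_loss(spiky))   # an isolated outlier raises the variation
```

Under these assumptions a single outlier pixel raises both terms, which is the behaviour the description attributes to them: one pulls the output toward the clean image, the other penalizes differences between neighbouring pixels.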
[0043] An imbalanced SAR image recognition system based on ring-shaped large-margin Gaussian mixture loss includes:
[0044] The preprocessing module is configured to construct noisy training and test set images based on the characteristic that speckle noise in measured SAR images conforms to the Gamma distribution.
[0045] The building module is configured to build a convolutional neural network for an imbalanced dataset and a multi-task loss function consisting of a large-margin Gaussian mixture loss, a ring loss, a Euclidean loss, and a total variation loss. Based on the multi-task loss function and the training set images, the convolutional neural network model is trained, and the parameters of the trained network model are obtained.
[0046] The output module is configured to recognize the test set images based on the trained network model parameters, and the recognition result is the synthetic aperture radar image target recognition result.
[0047] A computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of an unbalanced SAR image recognition method based on a ring-shaped large-margin Gaussian mixture loss.
[0048] A computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of an imbalanced SAR image recognition method based on a ring-shaped large-margin Gaussian mixture loss.
[0049] Compared with the prior art, the present invention has the following beneficial technical effects:
[0050] This invention provides an imbalanced SAR image recognition method and system based on a ring-shaped large-margin Gaussian mixture loss, comprising the following steps: constructing noisy training and test set images based on the characteristic that speckle noise in measured SAR images conforms to a Gamma distribution; constructing a convolutional neural network for the imbalanced dataset, and constructing a multi-task loss function composed of a large-margin Gaussian mixture loss, ring loss, Euclidean loss, and total variation loss; training the convolutional neural network model based on the multi-task loss function and the training set images to obtain the trained network model parameters; and recognizing the test set images based on the trained network model parameters to obtain the synthetic aperture radar image target recognition result. This application enhances intra-class compactness and inter-class separability and solves the class imbalance problem in SAR image target recognition. Compared with the prior art, it offers high recognition accuracy and strong noise resistance and can be widely used in the field of image processing technology.
Attached Figure Description
[0051] Figure 1 is a flowchart of the unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss in an embodiment of the present invention;
[0052] Figure 2 is a schematic diagram of the convolutional neural network structure constructed according to Embodiment 1 of the present invention;
[0053] Figure 3 is a schematic diagram of the structure of the denoising module in Figure 2;
[0054] Figure 4 is a schematic diagram of the structure of the recognition module in Figure 2.
Detailed Implementation
[0055] The present invention will be further described in detail below with reference to specific embodiments. These descriptions are for explanation purposes only and are not intended to limit the scope of the invention.
[0056] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.
[0057] It should be noted that the terms "first," "second," etc., in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.
[0058] This invention provides an imbalanced SAR image recognition method based on a ring-shaped large-margin Gaussian mixture loss. As shown in Figure 1, it includes the following steps:
[0059] Based on the characteristic that speckle noise in measured SAR images conforms to the Gamma distribution, a training set of images with noise and a test set of images are constructed.
[0060] A convolutional neural network for imbalanced datasets is constructed, and a multi-task loss function consisting of large-margin Gaussian mixture loss, ring loss, Euclidean loss, and total variation loss is constructed. Based on this loss function and the training set images, the convolutional neural network model is trained to obtain the parameters of the trained network model.
[0061] The test set images are recognized using the parameters of the trained network model, and the recognition result is the synthetic aperture radar image target recognition result.
[0062] Preferably, based on the characteristic that the speckle noise conforms to a Gamma distribution, the process of constructing noisy training set images and test set images is as follows:
[0063]
[0064] Where Γ(·) represents the gamma function and M is the shape parameter. According to the variance calculation formula of the gamma distribution, the variance of speckle noise is 1 / M. The smaller the value of M, the larger the variance and the stronger the speckle noise.
[0065] Noisy training and test set images are constructed by multiplying random speckle noise with images in each original dataset.
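The construction of noisy images can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `add_speckle`, the flat 64×64 test image, and the fixed seed are assumptions made for the example and do not come from the source.

```python
import random

def add_speckle(image, shape_m, rng=None):
    """Multiply every pixel by an independent unit-mean Gamma sample."""
    if rng is None:
        rng = random.Random()
    # gammavariate(alpha, beta) has mean alpha*beta and variance alpha*beta**2,
    # so alpha = M and beta = 1/M give mean 1 and variance 1/M.
    return [[pixel * rng.gammavariate(shape_m, 1.0 / shape_m) for pixel in row]
            for row in image]

rng = random.Random(0)
clean = [[1.0] * 64 for _ in range(64)]      # hypothetical flat test image
noisy = add_speckle(clean, shape_m=5, rng=rng)

# On a flat image the noisy pixels are the noise samples themselves,
# so their statistics should be close to mean 1 and variance 1/M = 0.2.
values = [v for row in noisy for v in row]
mean = sum(values) / len(values)
var = sum((v - mean) ** 2 for v in values) / len(values)
print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

Lowering `shape_m` in this sketch increases the sample variance, which matches the statement that a smaller M means stronger speckle.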
[0066] Preferably, the process of training the convolutional neural network model based on the multi-task loss function and the training set images is as follows:
[0067] The convolutional neural network consists of a denoising module and a recognition module. The denoising module includes convolutional layers, batch normalization layers, ReLU activation functions, and Tanh activation functions. The recognition module includes convolutional layers, max pooling layers, ReLU activation functions, and fully connected layers. Training set images are sequentially fed into the denoising and recognition modules.
[0068] It should be noted that, in this embodiment, the denoising module in the construction of the convolutional neural network is composed of convolutional layer 1, convolutional layer 2, normalization layer 1, convolutional layer 3, normalization layer 2, convolutional layer 4, normalization layer 3, convolutional layer 5, normalization layer 4, convolutional layer 6, normalization layer 5, and convolutional layer 7 connected in series, with a kernel size of 3×3 and a zero-padding size of 1. The output of the denoising module is...
[0069]
[0070]
[0071] where $C_i$, $B_i$, $\sigma_i$, and $T_i$ denote convolutional layer i, normalization layer i, ReLU activation function i, and Tanh activation function i, respectively.
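One consequence of the 3×3 kernel with zero-padding 1 in [0068] is that the denoising module keeps the spatial size of its input, which is what allows the denoised output to be compared pixel by pixel with the original image. The sketch below checks this with the standard convolution output-size formula. A stride of 1 is an assumption, since the text does not state it, and the 128×128 size is taken from Example 1.

```python
def conv_output_size(size, kernel, padding, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

size = 128                       # image side length used in Example 1
for layer in range(1, 8):        # convolutional layers 1 to 7 of the denoising module
    size = conv_output_size(size, kernel=3, padding=1)

print(size)                      # unchanged by the seven padded 3x3 convolutions
```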
[0072] Meanwhile, in constructing the convolutional neural network, the recognition module is composed of a convolutional layer 8 with a kernel size of 3×3, a max-pooling layer 1, a convolutional layer 9 with a kernel size of 4×4, a max-pooling layer 2, a convolutional layer 10 with a kernel size of 3×3, a max-pooling layer 3, a convolutional layer 11 with a kernel size of 4×4, a max-pooling layer 4, a fully connected layer 1, and a fully connected layer 2 connected in series. The output of the recognition module is...
[0073]
[0074] where $MP_i$ denotes max pooling layer i and $F_i$ denotes fully connected layer i.
[0075] Preferably, the process of constructing the loss function is as follows:
[0076] The overall loss function L is:
[0077] $L = L_{\text{LGM-R}} + \alpha L_{E} + \beta L_{TV}$;
[0078] where α and β are predefined parameters, and $L_{\text{LGM-R}}$, $L_{E}$, and $L_{TV}$ are the individual loss function terms;
[0079] The ring large-margin Gaussian mixture loss function term $L_{\text{LGM-R}}$ is:
[0080]
[0081]
[0082]
[0083]
[0084]
[0085]
[0086] where $L_{LGM}$ denotes the large-margin Gaussian mixture loss and $L_{R}$ denotes the ring loss; $X = \{x_1, x_2, \ldots, x_N\}$ denotes the features learned by the network; $N$ is the number of training samples; $k$ indexes the $k$-th Gaussian distribution ($k = 1, 2, \ldots, K$); $p(k)$ is the prior probability of the $k$-th Gaussian distribution; $p(k_n)$ is the prior probability of class $k_n$; $k_n$ is the true class of the feature $x_n$ of the $n$-th sample ($n = 1, 2, \ldots, N$); $|\cdot|$ denotes the matrix determinant; $\Sigma_k$ is the covariance of the $k$-th Gaussian distribution and $\Sigma_{k_n}$ is the covariance of the Gaussian distribution of class $k_n$; $d_{k_n}$ is half the squared Mahalanobis distance between $x_n$ and the mean $\mu_{k_n}$ of the Gaussian distribution of class $k_n$; $d_k$ is half the squared Mahalanobis distance between $x_n$ and the mean $\mu_k$ of the $k$-th Gaussian distribution; $\Sigma_k^{-1}$ and $\Sigma_{k_n}^{-1}$ denote the inverses of the covariance matrices; $\mathbb{1}(\cdot)$ denotes the indicator function, which equals 1 when $k$ equals the true class $k_n$ and 0 otherwise; $m$ is the margin; $\|x_n\|_2$ is the L2 norm (Euclidean norm) of the feature $x_n$; and $R$ is a predefined reference radius.
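The roles of the margin m and the reference radius R can be seen in a small numerical sketch. It is not the patent's formula, which is not reproduced in this text. It assumes identity covariance matrices and equal class priors (so the determinant and prior factors cancel), a ring term normalized by 2N, and hypothetical two-dimensional features and class means.

```python
import math

def lgm_ring_loss(features, labels, means, margin, radius):
    """Return (L_LGM, L_R) for one batch, assuming identity covariance and equal priors."""
    n = len(features)
    lgm, ring = 0.0, 0.0
    for x, k_n in zip(features, labels):
        # d_k: half the squared distance from x to each class mean
        d = [0.5 * sum((xi - mi) ** 2 for xi, mi in zip(x, mu)) for mu in means]
        # the margin enlarges only the distance to the true class
        logits = [-dk * (1.0 + margin) if k == k_n else -dk
                  for k, dk in enumerate(d)]
        # numerically stable log-softmax evaluated at the true class
        top = max(logits)
        log_norm = top + math.log(sum(math.exp(v - top) for v in logits))
        lgm -= (logits[k_n] - log_norm) / n
        # ring term: penalize feature norms that differ from the reference radius
        norm = math.sqrt(sum(xi * xi for xi in x))
        ring += (norm - radius) ** 2 / (2.0 * n)
    return lgm, ring

# Hypothetical class means and features for two classes.
means = [[1.0, 0.0], [0.0, 1.0]]
features = [[1.0, 0.0], [0.0, 2.0]]
labels = [0, 1]

lgm_no_margin, _ = lgm_ring_loss(features, labels, means, margin=0.0, radius=1.0)
lgm_margin, ring = lgm_ring_loss(features, labels, means, margin=0.3, radius=1.0)
print(lgm_no_margin, lgm_margin, ring)
```

In this sketch a positive margin makes the classification term stricter for samples that are not exactly at their class mean, and the ring term is non-zero only for the feature whose norm differs from R.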
[0087] The Euclidean loss function term $L_{E}$ is:
[0088]
[0089] where the input is the noisy SAR image, Y is the corresponding original image, and the mapping applied to the noisy image is the network used for noise reduction;
[0090] The total variation loss function term $L_{TV}$ is:
[0091]
[0092] where i is the row index and j is the column index of the image, and the indexed term denotes the pixel value located in the i-th row and j-th column of the denoised image;
[0093] The process of training the convolutional neural network model to obtain the parameters of the trained network model is as follows:
[0094] The training data are input into the network for training, and the training set images pass through all modules in sequence;
[0095] During training, the mini-batch stochastic gradient descent algorithm is used, and the data batch size, training epochs, learning rate, and learning rate decay value are set, with the values of parameters α and β set respectively. Specifically, in this embodiment, the data batch size can be set to 64, the training epochs can be set to 100 epochs, the learning rate can be set to 0.01, and the learning rate decay value is set to halve the learning rate after every 50 epochs.
[0096] During training, the loss function is calculated through forward propagation, and the weights of the network are updated through backpropagation until the loss function converges, thus obtaining the parameters of the trained network model.
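The learning rate settings of this embodiment (initial rate 0.01, halved after every 50 epochs, 100 epochs in total) amount to a step-decay schedule. A minimal sketch follows. The function name and the zero-based epoch index are assumptions.

```python
def learning_rate(epoch, base_lr=0.01, step=50, factor=0.5):
    """Learning rate for a zero-based epoch index under step decay."""
    return base_lr * factor ** (epoch // step)

schedule = [learning_rate(epoch) for epoch in range(100)]
print(schedule[0], schedule[49], schedule[50], schedule[99])
```

With these values the first 50 epochs run at 0.01 and the remaining 50 at 0.005.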
[0097] Preferably, the large-margin Gaussian mixture loss function term is used to model the deep features of the dataset and to introduce a margin between the features of different classes, which enhances the intra-class compactness and inter-class separability of the data. In addition, so that each class of samples is treated equally and the network does not become biased toward the classes with more samples during training, the ring loss is used to give each class of samples a balanced representation in the feature space.
[0098] Preferably, the Euclidean loss function term $L_{E}$ is used to measure the similarity between samples before and after denoising: the closer the denoised image is to the original image Y, the better the denoising effect.
[0099] Preferably, the total variation loss function term $L_{TV}$ is used to reduce noise in the image and make it smoother. Specifically, reducing the total variation loss brings the values of adjacent pixels, in both the row and column directions, closer together, thereby reducing image noise and improving image smoothness.
[0100] Example 1
[0101] The unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss in this embodiment consists of the following steps:
[0102] Constructing coherent speckle noise conforming to the Gamma distribution:
[0103] Three classes of the MSTAR dataset were used as training and testing data, with an image size of 128×128. The original training set was acquired at a depression angle of 17°, and the original test set at a depression angle of 15°. The following speckle noise conforming to a Gamma distribution was used:
[0104]
[0105] In this embodiment, M = 5 is used to obtain simulated speckle noise, which is multiplied by the original images to obtain the training dataset and the test dataset;
[0106] Constructing convolutional neural networks for imbalanced data:
[0107] The convolutional neural network consists of a denoising module and a recognition module. The denoising module includes convolutional layers, batch normalization layers, ReLU activation functions, and Tanh activation functions. The recognition module includes convolutional layers, max pooling layers, ReLU activation functions, and fully connected layers. Training set images are sequentially fed into the denoising and recognition modules.
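[0108] As shown in Figure 2, the denoising and recognition modules in this embodiment are connected in series;
[0109] The denoising module consists of convolutional layer 1, convolutional layer 2, normalization layer 1, convolutional layer 3, normalization layer 2, convolutional layer 4, normalization layer 3, convolutional layer 5, normalization layer 4, convolutional layer 6, normalization layer 5, and convolutional layer 7 connected in series, each convolution having a kernel size of 3×3 and a zero-padding size of 1. The output of the denoising module is...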
[0110]
[0111]
[0112] As shown in Figure 3, the denoising module in this embodiment is composed of layers connected in series;
[0113] Meanwhile, in constructing the convolutional neural network, the recognition module is composed of a convolutional layer 8 with a kernel size of 3×3, a max-pooling layer 1, a convolutional layer 9 with a kernel size of 4×4, a max-pooling layer 2, a convolutional layer 10 with a kernel size of 3×3, a max-pooling layer 3, a convolutional layer 11 with a kernel size of 4×4, a max-pooling layer 4, a fully connected layer 1, and a fully connected layer 2 connected in series. The output of the recognition module is...
[0114]
[0115] As shown in Figure 4, the recognition module in this embodiment is composed of layers connected in series;
[0116] Construct the loss function:
[0117] The loss function L consists of three parts, and is determined by the following formula:
[0118] $L = L_{\text{LGM-R}} + \alpha L_{E} + \beta L_{TV}$
[0119] where α and β are predefined parameters, and $L_{\text{LGM-R}}$, $L_{E}$, and $L_{TV}$ are the individual loss function terms; in this embodiment, α and β are taken as 0.01 and 0.02, respectively.
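The weighting of the three terms can be written out directly. In the sketch below, the default weights are the values of this embodiment (α = 0.01, β = 0.02), while the function name and the three input loss values are hypothetical numbers chosen only to show the arithmetic.

```python
def total_loss(l_lgm_r, l_e, l_tv, alpha=0.01, beta=0.02):
    """Overall loss L = L_LGM-R + alpha * L_E + beta * L_TV."""
    return l_lgm_r + alpha * l_e + beta * l_tv

# Hypothetical per-batch values of the three terms.
example = total_loss(l_lgm_r=0.5, l_e=2.0, l_tv=10.0)
print(example)   # 0.5 + 0.01 * 2.0 + 0.02 * 10.0
```

The small weights keep the two denoising terms from dominating the recognition term when their raw magnitudes are larger.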
[0120] The ring large-margin Gaussian mixture loss function term $L_{\text{LGM-R}}$ is calculated using the following formulas:
[0121]
[0122]
[0123]
[0124]
[0125]
[0126]
[0127] The Euclidean loss function term $L_{E}$ is calculated using the following formula:
[0128]
[0129] The total variation loss function term $L_{TV}$ is calculated using the following formula:
[0130]
[0131] Training the network:
[0132] The training data are input into the network for training, and the training set images pass through all modules in sequence. During training, the mini-batch stochastic gradient descent algorithm is used, with a data batch size of 64, a total of 100 training epochs, and a learning rate of 0.01, which is halved every 50 epochs.
[0133] During training, the loss function is calculated through forward propagation and the network weights are updated through backpropagation until the loss function converges.
[0134] Target recognition:
[0135] The test dataset is input into the trained network, and the recognition result is obtained through the denoising and recognition modules;
[0136] This completes the unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss.
[0137] To verify the effectiveness of this invention in recognizing class-imbalanced datasets, the MSTAR three-class dataset was used. The original training set was acquired at a depression angle of 17°, and the original test set at a depression angle of 15°. The number of samples for each class in the dataset is summarized in Table 1.
[0138] Table 1. Number of training and test samples in the MSTAR dataset
[0139]
[0140] The MSTAR dataset was subsampled to construct its class-imbalanced version (MSTAR-Class Imbalance, MSTAR-CI). Specifically, the per-class sample counts in both the training and test sets were made to follow approximately an arithmetic progression, so that some classes have more images and others have relatively fewer. The number of samples for each class in the dataset is shown in Table 2.
[0141] Table 2 Number of training and testing samples in MSTAR-CI
[0142]
[0143] Speckle noise with M = 5 was added to the dataset to verify the performance of the invention on imbalanced datasets. The effectiveness of the invention is verified through ablation experiments with the following settings:
[0144] LGM: The proposed convolutional neural network is trained on MSTAR-CI, with the large-margin Gaussian mixture loss as the recognition loss function;
[0145] LGM-R: Based on LGM, a ring loss is further added to the loss function.
[0146] The recognition performance was verified based on the recognition accuracy for each specific category and the overall recognition accuracy. The experimental results are shown in Table 3.
[0147] Table 3 Ablation Experiment Results of the Present Invention
[0148]
[0149] As shown in Table 3, the accuracy obtained using only the large-margin Gaussian mixture loss is biased towards the classes with more samples, while the recognition rate for BTR70, the class with the fewest samples, is lower. After adding the ring loss, the recognition accuracy for BTR70 is significantly improved. This indicates that the method proposed in this invention learns the features of each class in a balanced way rather than favoring the more numerous classes, which verifies the good performance of this invention in recognizing imbalanced datasets.
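The two metrics used in the ablation, per-class accuracy and overall accuracy, can be computed as in the sketch below. The label lists are hypothetical and are not the results of Table 3. The class name BTR70 comes from the text, while `class_a` and `class_b` are placeholder names.

```python
from collections import Counter

def per_class_accuracy(true_labels, predicted_labels):
    """Return ({class: accuracy}, overall accuracy)."""
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    per_class = {c: correct[c] / totals[c] for c in totals}
    overall = sum(correct.values()) / len(true_labels)
    return per_class, overall

# Hypothetical imbalanced test set: 5, 3, and 2 samples per class.
true = ["class_a"] * 5 + ["class_b"] * 3 + ["BTR70"] * 2
pred = ["class_a"] * 5 + ["class_b", "class_b", "class_a"] + ["BTR70", "class_a"]

per_class, overall = per_class_accuracy(true, pred)
print(per_class, overall)
```

In this made-up example the overall accuracy hides a much lower accuracy on the smallest class, which is why both metrics are reported.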
[0150] To verify the denoising performance of this invention, the similarity between the denoised image and the original image was evaluated using the peak signal-to-noise ratio (PSNR) and the mean squared error (MSE). A higher PSNR value indicates a stronger denoising capability and a denoising result closer to the original clean image. MSE reflects the degree of difference between the estimate and the quantity being estimated; a lower MSE value indicates that the denoised image is more similar to the original image and that the denoising effect is better.
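Both indicators follow from the pixel-wise error, as the sketch below shows. The peak value of 255 assumes 8-bit images, which the text does not state, and the 2×2 images are hypothetical.

```python
import math

def mse(denoised, original):
    """Mean squared error between two images of the same size."""
    h, w = len(original), len(original[0])
    return sum((denoised[i][j] - original[i][j]) ** 2
               for i in range(h) for j in range(w)) / (h * w)

def psnr(denoised, original, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(denoised, original)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

original = [[100.0, 100.0], [100.0, 100.0]]
denoised = [[110.0, 100.0], [100.0, 90.0]]

print(mse(denoised, original))    # two pixels are off by 10, so the MSE is 200 / 4
print(psnr(denoised, original))   # a little above 31 dB for this error level
```

Because PSNR is a decreasing function of MSE for a fixed peak value, the two indicators always rank denoising results in the same order.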
[0151] As can be seen from Table 4, the present invention can achieve good performance in suppressing speckle noise in measured SAR images, and still has good denoising performance under strong noise interference.
[0152] Table 4. Noise Reduction Effect Evaluation Indicators
[0153]
[0154] To verify the performance of this invention on noisy datasets, synthetic SAR image samples with different speckle noise intensities were constructed using noise data with different shape parameters M. Example 1 of this invention was compared with the following prior art works: "K. Simonyan and A. Zisserman, 'Very deep convolutional networks for large-scale image recognition,' in Computer Science, vol.40, no.10, pp.1–14, Sep.2014." as Comparative Experiment 1; "K. He, X. Zhang, S. Ren, and J. Sun, 'Deep residual learning for image recognition,' in Proceedings of the IEEE conference on computer vision and pattern recognition, pp.770-778, 2016." as Comparative Experiment 2; and "S. Chen, H. Wang, F. Xu and Y.-Q. Jin, 'Target Classification Using the Deep Convolutional Networks for SAR Images,' in IEEE Transactions on Geoscience and Remote Sensing, vol.54, no.8, pp.4806-4817, Aug.2016." as Comparative Experiment 3. The comparison was evaluated by recognition accuracy, and the parameter settings of all networks were the same under the same noise level.
[0155] Table 5 shows the recognition rates under different levels of speckle noise intensity. As M increases, the speckle variance decreases, the image quality improves, and the recognition performance of all methods improves. However, the recognition performance of the present invention is the best under all noise levels. Specifically, the recognition accuracies of comparative experiments 1, 2, and 3 at M=5 are approximately 98.61%, 95.53%, and 91.96%, respectively. The recognition rate of the present invention at M=5 is superior to the above three methods, verifying the effectiveness of the present invention. On the other hand, even under high noise intensity, such as M=0.2, the present invention can still achieve 86.68%, which is better than the other three methods, verifying the robustness of the present invention to speckle noise.
[0156] Table 5 Experimental Results
[0157]
[0158]
[0159] This invention provides an imbalanced SAR image recognition system based on ring-shaped large-margin Gaussian mixture loss, comprising:
[0160] The preprocessing module is configured to construct noisy training and test set images based on the characteristic that speckle noise in measured SAR images conforms to the Gamma distribution.
[0161] The building module is configured to build a convolutional neural network for an imbalanced dataset and a multi-task loss function consisting of a large-margin Gaussian mixture loss, a ring loss, a Euclidean loss, and a total variation loss. Based on the multi-task loss function and the training set images, the convolutional neural network model is trained, and the parameters of the trained network model are obtained.
[0162] The output module is configured to recognize the test set images based on the trained network model parameters, and the recognition result is the synthetic aperture radar image target recognition result.
[0163] In another embodiment of the present invention, a computer device is provided, comprising a processor and a memory. The memory stores a computer program, which includes program instructions. The processor executes the program instructions stored in the computer storage medium. The processor may be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. It is the computing and control core of the terminal, suitable for implementing one or more instructions, specifically suitable for loading and executing one or more instructions in the computer storage medium to achieve a corresponding method flow or corresponding function. The processor described in this embodiment of the present invention can implement the operation of an unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss.
[0164] In another embodiment of the present invention, a storage medium is provided, specifically a computer-readable storage medium (Memory), which is a memory device in a computer device used to store programs and data. It is understood that the computer-readable storage medium here can include both the built-in storage medium in the computer device and extended storage media supported by the computer device. The computer-readable storage medium provides storage space that stores the terminal's operating system. Furthermore, the storage space also stores one or more instructions suitable for loading and execution by a processor. These instructions can be one or more computer programs (including program code). It should be noted that the computer-readable storage medium here can be high-speed RAM or non-volatile memory, such as at least one disk storage device. The processor can load and execute one or more instructions stored in the computer-readable storage medium to implement the corresponding steps of the unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss in the above embodiments.
[0165] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0166] This invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0167] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0168] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0169] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. An unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss, characterized in that it comprises the following steps: constructing noisy training set images and test set images based on the characteristic that speckle noise in measured SAR images conforms to the Gamma distribution; constructing a convolutional neural network for imbalanced datasets and a multi-task loss function composed of a large-margin Gaussian mixture loss, a ring loss, a Euclidean loss, and a total variation loss; training the convolutional neural network model on the training set images using the multi-task loss function, and obtaining the parameters of the trained network model; and recognizing the test set images using the parameters of the trained network model, the recognition result being the synthetic aperture radar image target recognition result.
2. The unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss according to claim 1, characterized in that the noisy training and test set images are constructed, based on the characteristic that speckle noise follows a Gamma distribution, as follows: the speckle noise is modeled by a Gamma probability density function in which Γ(·) denotes the gamma function and M is the shape parameter; according to the formula for the variance of the Gamma distribution, the variance of the speckle noise is 1/M, so the smaller the value of M, the larger the variance and the stronger the speckle noise; and the noisy training and test set images are constructed by multiplying random speckle noise with the images in each original dataset.
3. The unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss according to claim 1, characterized in that the convolutional neural network model is trained on the training set images using the multi-task loss function as follows: the convolutional neural network consists of a denoising module and a recognition module; the denoising module includes convolutional layers, batch normalization layers, ReLU activation functions, and Tanh activation functions; the recognition module includes convolutional layers, max pooling layers, ReLU activation functions, and fully connected layers; and the training set images are fed sequentially through the denoising module and then the recognition module.
4. The unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss according to claim 1, characterized in that the multi-task loss function composed of the large-margin Gaussian mixture loss, ring loss, Euclidean loss, and total variation loss is constructed as follows: the overall loss function L is $L = L_{LGM\text{-}R} + \alpha L_E + \beta L_{TV}$, where α and β are predefined parameters and $L_{LGM\text{-}R}$, $L_E$, and $L_{TV}$ are the individual loss terms; the ring-shaped large-margin Gaussian mixture loss term $L_{LGM\text{-}R}$ is defined in terms of $L_{LGM}$, the large-margin Gaussian mixture loss, and $L_R$, the ring loss, where $X = \{x_1, x_2, \ldots, x_N\}$ denotes the features learned by the network; N is the number of training samples; k indexes the k-th Gaussian distribution (k = 1, 2, ..., K); $p(k)$ is the prior probability of the k-th Gaussian distribution; $p(k_n)$ is the prior probability of class $k_n$; $k_n$ is the true class of the feature $x_n$ of the n-th sample (n = 1, 2, ..., N); |·| denotes the determinant of a matrix; $\Sigma_k$ is the covariance of the k-th Gaussian distribution and $\Sigma_{k_n}$ is the covariance of the Gaussian distribution of class $k_n$; $d_{k_n}$ is half the squared Mahalanobis distance between $x_n$ and the mean $\mu_{k_n}$ of the Gaussian distribution of class $k_n$; $d_k$ is half the squared Mahalanobis distance between $x_n$ and the mean $\mu_k$ of the k-th Gaussian distribution; $\Sigma_k^{-1}$ and $\Sigma_{k_n}^{-1}$ denote the inverses of the corresponding covariance matrices; the indicator function equals 1 when k equals the true class $k_n$ and 0 otherwise; m is the margin; $\|x_n\|_2$ is the L2 norm (Euclidean norm) of feature $x_n$; and R is a predefined reference radius; the Euclidean loss term $L_E$ is defined over the input noisy SAR image, the corresponding original image Y, and the network used for noise reduction; the total variation loss term $L_{TV}$ is defined over the pixel values of the denoised image, where i is the row index of the image, j is the column index of the image, and each pixel value is the one located in the i-th row and j-th column of the denoised image; the convolutional neural network model is trained, and the parameters of the trained network model are obtained, as follows: the training data are input into the network, with the training set images passing through all modules in sequence; training uses the mini-batch stochastic gradient descent algorithm; the batch size, number of training epochs, learning rate, and learning rate decay value are set, as are the values of the parameters α and β; during training, the loss function is computed by forward propagation and the network weights are updated by backpropagation until the loss function converges, yielding the parameters of the trained network model.
5. The unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss according to claim 4, characterized in that the large-margin Gaussian mixture loss term is used to model the deep features of the dataset and to introduce a margin between the features of different classes; and the ring loss is used to obtain a balanced representation of each class of samples in the feature space.
6. The unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss according to claim 4, characterized in that the Euclidean loss term $L_E$ is used to measure the similarity between samples before and after denoising, so that the denoised image is as close as possible to the original image Y.
7. The unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss according to claim 4, characterized in that the total variation loss term $L_{TV}$ is used to reduce noise in the image: decreasing the total variation loss drives the values of adjacent pixel pairs in the denoised image toward each other.
8. An unbalanced SAR image recognition system based on ring-shaped large-margin Gaussian mixture loss, characterized in that the system applies the unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss according to any one of claims 1-7 and comprises: a preprocessing module configured to construct noisy training and test set images based on the characteristic that speckle noise in measured SAR images conforms to the Gamma distribution; a building module configured to construct a convolutional neural network for an imbalanced dataset, together with a multi-task loss function composed of a large-margin Gaussian mixture loss, a ring loss, a Euclidean loss, and a total variation loss, to train the convolutional neural network model on the training set images using the multi-task loss function, and to obtain the parameters of the trained network model; and an output module configured to recognize the test set images using the trained network model parameters, the recognition result being the synthetic aperture radar image target recognition result.
9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the steps of the unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss according to any one of claims 1-7.
10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the unbalanced SAR image recognition method based on ring-shaped large-margin Gaussian mixture loss according to any one of claims 1-7.
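The roles of the margin m and the reference radius R named in claims 4 and 5 can be illustrated with a simplified sketch. It is not the patented formula, which is not reproduced in this text. It assumes identity covariance matrices and equal class priors (so the determinant and prior terms cancel), applies the margin by scaling the true-class distance by (1 + m) as is common for large-margin Gaussian mixture losses, and adds the ring term with a hypothetical weight `ring_weight`. All function names are illustrative.

```python
import math


def half_sq_dist(x, mu):
    """Half the squared distance to a class mean.

    Assumption: identity covariance, so this equals d_k in the claims.
    """
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, mu))


def lgm_loss(features, labels, means, margin):
    """Large-margin Gaussian mixture loss (simplified: equal priors, identity covariance)."""
    total = 0.0
    for x, k_n in zip(features, labels):
        # The logit of class k is -d_k; the true-class distance is scaled by (1 + margin).
        logits = [
            -half_sq_dist(x, mu) * (1.0 + margin if k == k_n else 1.0)
            for k, mu in enumerate(means)
        ]
        top = max(logits)  # subtract the maximum for numerical stability
        log_norm = top + math.log(sum(math.exp(z - top) for z in logits))
        total += log_norm - logits[k_n]
    return total / len(features)


def ring_loss(features, radius):
    """Mean squared gap between each feature's L2 norm and the reference radius R."""
    return sum(
        (math.sqrt(sum(a * a for a in x)) - radius) ** 2 for x in features
    ) / len(features)


def ring_lgm_loss(features, labels, means, margin, radius, ring_weight=1.0):
    """Sum of the two terms; ring_weight is a hypothetical balancing factor."""
    return lgm_loss(features, labels, means, margin) + ring_weight * ring_loss(features, radius)


if __name__ == "__main__":
    feats = [[3.0, 4.0], [-4.0, 3.0]]
    means = [[3.0, 4.0], [-4.0, 3.0]]
    print(ring_lgm_loss(feats, [0, 1], means, margin=0.3, radius=5.0))
```

In this sketch a positive margin raises the loss for features that sit between class means, which pushes classes apart, while the ring term penalizes features whose norm departs from R regardless of class, which is how a balanced representation across classes is encouraged.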