Electromagnetic target recognition method based on interpretable multi-task learning, storage medium and equipment
By constructing a multi-task learning network and generating perturbation signals, and optimizing model weights, the interpretability and noise resistance issues in electromagnetic signal identification are solved, improving identification accuracy and transparency. This approach is applicable to electromagnetic spectrum monitoring and communication security.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SOUTHEAST UNIV
- Filing Date
- 2025-04-07
- Publication Date
- 2026-06-26
AI Technical Summary
Existing electromagnetic signal recognition technologies suffer from problems such as insufficient model interpretability, noise and sparsity challenges, poor adaptability to dynamic environments, and insufficient coordination among branches in multi-task environments. In particular, they lack recognition accuracy and transparency in low signal-to-noise ratio and complex electromagnetic environments.
A multi-task learning network for modulation recognition and individual recognition is constructed. DRSN and cross-stitch modules are used. By generating perturbation signals and masks, a local linear model is trained to optimize the model weights. Constellation diagram visualization is combined to improve model interpretability and recognition accuracy.
It improves the collaboration and noise resistance of multi-task recognition, enhances the robustness of the model under low signal-to-noise ratio, and supports constellation diagram visualization to improve model transparency, making it suitable for signal analysis in complex electromagnetic environments.
Figure CN120408138B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the technical field of electromagnetic signals, and mainly relates to an electromagnetic target recognition method, storage medium and device based on interpretable multi-task learning.
Background Technology
[0002] In modern communication systems, Specific Emitter Identification (SEI) and Automatic Modulation Classification (AMC) are two core technologies. SEI extracts transmitter radio frequency fingerprint (RFF) features from electromagnetic signals, while AMC automatically identifies the modulation scheme of a signal. Together, they support key applications such as cognitive radio dynamic spectrum optimization, transmitter fingerprint authentication in secure communications, illegal signal monitoring in spectrum management, and signal source localization in electronic warfare. However, feature distortion caused by noise interference, the coupling between modulation schemes and transmitter features, and the inherent black-box nature of deep learning pose significant challenges in engineering practice. While deep learning remains the mainstream research focus, it faces core issues such as strong data dependence (requiring massive amounts of labeled data), insufficient generalization across scenarios and frequency bands, inadequate use of physical-layer features, and opaque decision criteria. These problems collectively hinder progress in electromagnetic signal recognition towards greater robustness and interpretability.
[0003] Existing technologies include signal classification and recognition techniques. For example, Chinese patent CN202011184054.X proposes a method for individual identification of communication radiation sources based on complex deep residual networks, integrating radio frequency fingerprint feature extraction and recognition to improve recognition accuracy. This method targets steady-state radio frequency baseband signals, removing noise segments during data acquisition to obtain radio frequency fingerprints, which are then input into a complex deep residual network for identification. However, for transient radio frequency baseband signals, the receiver carrier frequency and phase deviation need to be estimated and compensated before acquisition, and the patent does not address interpretability. Chinese patent CN202010961558.1 discloses a method for individual identification of communication radiation sources based on residual neural networks. This method acquires data through steps such as receiving communication radiation source signals, calculating the signal bispectrum, performing bispectral nonparametric indirect estimation, and obtaining bispectral contour maps to train the residual network. Finally, the trained network is used to detect and identify different communication radiation sources. This method can reduce signal noise interference, reduce computational load, and improve recognition accuracy. However, the patent does not report recognition accuracy or interpretability under low signal-to-noise ratio conditions, so its generalization ability and model transparency are not established. Additionally, Chinese patent CN202110008169.1 provides an automatic identification method for communication signal modulation types. It identifies 2FSK and 4FSK signals by extracting the number of spectral peaks of the unknown signal; identifies 16QAM, BPSK, and QPSK signals by obtaining the time-domain envelope standard deviation; and distinguishes BPSK and QPSK signals by using an instantaneous autocorrelation algorithm to obtain the zero-crossing ratio. This three-level classification improves the recognition performance for each signal. However, this method has low recognition accuracy under low signal-to-noise ratio conditions. Furthermore, most existing patents focus on single-task recognition; models with high recognition accuracy in multi-task settings are lacking, and existing models lack decision-making interpretability.
[0004] Despite the progress made by existing methods in the field of electromagnetic signal recognition, the following core challenges remain: (1) Insufficient model interpretability: Existing methods (such as Grad-CAM and LIME) have poor adaptability to the signal domain and cannot generate interpretations that conform to the characteristics of electromagnetic signals (such as constellation diagram correlation). (2) Noise and sparsity challenges: Sensor noise, signal attenuation, and multipath effects can easily lead to sparsity or distortion of electromagnetic signals, making it difficult for existing feature extraction methods to effectively extract features. (3) Poor adaptability to dynamic environments: The generalization ability of traditional models decreases significantly in complex electromagnetic environments. (4) Insufficient coordination among branches in multi-task environments: The mutual coupling between branches affects the recognition accuracy.
Summary of the Invention
[0005] This invention addresses the problems existing in the prior art by proposing an electromagnetic target recognition method, storage medium, and device based on interpretable multi-task learning. First, a multi-task learning network oriented towards modulation recognition and individual recognition is constructed. The original signal is input into the multi-task learning network, hyperparameters are set, and the network is trained. A generator that outputs perturbation signals is constructed; it segments the original signal and generates an original mask. A perturbation mask is generated by inverting elements of the original mask, and the perturbation signal is generated from it. The perturbation signal is input into the trained multi-task learning network to obtain the probability distribution of classification results under different tasks. The probability predicted by the multi-task model for the perturbation signal is used as a supervision label, a local linear model is trained on the original mask and the perturbation mask, and the optimized model weights $W_g$ are taken as the contribution of each subsequence to the explanation. $W_g$ is then normalized, and the weights are mapped back to the original signal length. The time-frequency domain signal regions that play a key role in model decision-making are visualized and labeled based on the constellation diagram.
[0006] To achieve the above objectives, the technical solution adopted by this invention is: an electromagnetic target recognition method based on interpretable multi-task learning, comprising the following steps:
[0007] S1: Construct a multi-task learning network for modulation recognition and individual recognition. The backbone network of the multi-task learning network is DRSN. The encoding network is built by using a hard sharing method and introducing a cross-stitch module. A dual decoding branch is set up, and a residual shrinking module, an average pooling layer and a fully connected layer with the corresponding number of categories are used.
[0008] S2: Input the original signal into the multi-task learning network constructed in step S1, set the hyperparameters, and train the multi-task learning network; the hyperparameters include at least the signal specification, batch size, and number of training rounds.
[0009] S3: Construct a generator to output the perturbation signal, segment the original signal to be interpreted, generate a perturbation mask by randomly reversing elements, and combine the perturbation mask with the original signal to obtain the perturbation signal;
[0010] S4: Input the perturbation signal into the multi-task learning network trained in step S2 to obtain the probability distribution of classification results under different tasks;
[0011] S5: Input the perturbation mask and its similarity into the linear model, and perform model fitting using the probability distribution to obtain the explanatory weight $W_g$ for each subsequence;
[0012] S6: Normalize $W_g$ and map the weights back to the original signal length. The time-frequency domain signal regions that play a key role in model decision-making are visualized and labeled based on the constellation diagram.
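[0013] As an improvement to the present invention, the cross-stitch structure of the cross-stitch module in step S1 is specifically as follows: assuming the outputs of the i-th parallel structure are $\mathrm{outA}_i$ and $\mathrm{outB}_i$, the inputs $\mathrm{inputA}_{i+1}$ and $\mathrm{inputB}_{i+1}$ of the (i+1)-th parallel structure are:
[0014] $$\begin{bmatrix} \mathrm{inputA}_{i+1} \\ \mathrm{inputB}_{i+1} \end{bmatrix} = \begin{bmatrix} \alpha_{AA} & \alpha_{AB} \\ \alpha_{BA} & \alpha_{BB} \end{bmatrix} \begin{bmatrix} \mathrm{outA}_i \\ \mathrm{outB}_i \end{bmatrix}$$
[0015] where $\alpha_{AA}$, $\alpha_{AB}$, $\alpha_{BA}$ and $\alpha_{BB}$ are trainable parameters.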
[0016] In the encoding network, the input signal first passes through a convolutional layer, a batch normalization layer, and a ReLU activation layer before entering the cross-stitch module. This module consists of two identical residual shrink blocks and a cross-stitch structure. After passing through multiple repeated cross-stitch modules, shared features are extracted.
[0017] As an improvement of the present invention, the residual shrinkage module in step S1 applies soft thresholding. Soft thresholding sets features close to zero to zero while retaining negative features, and its derivative is either 0 or 1; the soft threshold function is defined as:
[0018] $$y = \begin{cases} x - \tau, & x > \tau \\ 0, & -\tau \le x \le \tau \\ x + \tau, & x < -\tau \end{cases}$$
[0019] Where x is the input feature, y is the output feature, and τ is the threshold parameter.
[0020] In the dual decoding branch, each branch first passes through multiple residual shrinking modules, and then through an average pooling layer. The output of the average pooling layer is flattened and fed into a fully connected layer corresponding to the number of classes to obtain the prediction result.
[0021] As another improvement of the present invention, in the hyperparameters of step S2, the original signal sample dimension adopts I / Q signal with a specification of (2,1024), the batch size is 32, the training rounds are initially set to 30, and are adjusted according to the loss changes during training.
[0022] As another improvement of the present invention, in step S3, the original signal is divided into d subsequences in the signal generator using the ruptures library to generate the original mask $x \in 1^d$ (an all-ones vector of length d). A perturbation mask $x' \in \{0,1\}^d$ is generated by randomly reversing elements of x. The replacement noise is sampled from a Gaussian distribution $\mathcal{N}(\mu_i, \sigma_i^2)$, where $\mu_i$ and $\sigma_i$ are the channel mean and standard deviation estimated from the dataset at each time step i. The signal generator synthesizes a perturbation signal based on x'. Specifically, it replaces each subsequence whose position in x' is 0 with Gaussian noise, while retaining each subsequence whose position is 1.
[0023] As another improvement of the present invention, in step S4, the perturbation signal is input into the trained multi-task learning network, and the classification result is obtained through different task branches of the network. After processing by the Softmax function, the probability of each category predicted by the model is obtained.
[0024] As a further improvement of the present invention, in step S5, the prediction probability of the perturbation signal by the multi-task model is used as a supervision label, and a local linear model is trained on the original mask and the perturbation mask. The optimized model weights $W_g$ represent the contribution of each subsequence to the explanation. The optimization objective function is as follows:
[0025]
[0026] Where F represents the Frobenius distance, $f_{model}(\cdot)$ represents the prediction of the local linear model, and T is the transpose operation.
[0027] As a further improvement of the present invention, in step S6, the explanatory weight $W_g$ is normalized as follows:
[0028]
[0029] To achieve the above objectives, the present invention also adopts the following technical solution: a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the electromagnetic target recognition method based on interpretable multi-task learning as described in any one of claims 1-8.
[0030] To achieve the above objectives, the present invention also adopts the following technical solution: a computer device, comprising:
[0031] Memory, used to store instructions;
[0032] A processor for executing the instructions, causing the computer device to perform the operation of the electromagnetic target recognition method based on interpretable multi-task learning as described in any one of claims 1-8.
[0033] Compared with the prior art, the present invention has the following beneficial effects:
[0034] (1) High synergy among multiple tasks: By jointly optimizing the specific emitter identification (SEI) and automatic modulation classification (AMC) tasks, this invention improves identification accuracy compared with the single-task model;
[0035] (2) Strong noise resistance: The residual shrinkage network reduces noise through soft thresholding, which improves the robustness of the model in low signal-to-noise ratio (SNR) scenarios. In the AMC and SEI tasks in particular, it can more accurately extract weak discriminative features masked by noise;
[0036] (3) High interpretability: It supports constellation diagram visualization, which allows users to intuitively understand the decision-making basis of the model and meet the needs of military and civilian fields for model transparency;
[0037] (4) Wide range of applications: This invention is applicable to fields such as electromagnetic spectrum monitoring, wireless device identification and communication security, and provides technical support for signal analysis and reliable decision-making in complex electromagnetic environments.
Description of the Drawings
[0038] Figure 1 is a flowchart illustrating the steps of the electromagnetic target recognition method based on interpretable multi-task learning of the present invention.
[0039] Figure 2 is a structural framework diagram of the electromagnetic target recognition method based on interpretable multi-task learning of the present invention.
[0040] Figure 3 is an experimental comparison diagram for the test example of the present invention.
Detailed Description
[0041] The present invention will be further illustrated below with reference to the accompanying drawings and specific embodiments. It should be understood that the following specific embodiments are for illustrative purposes only and are not intended to limit the scope of the invention.
[0042] Example 1
[0043] As shown in Figure 1, the electromagnetic target recognition method based on interpretable multi-task learning is applied within the framework shown in Figure 2. The specific steps are as follows:
[0044] Step S1: Multi-task model construction.
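[0045] A multi-task learning network for modulation recognition and individual recognition was constructed. The backbone network used was a DRSN, and the encoding network was built using a hard-sharing approach with cross-stitch modules introduced. The cross-stitch unit in the cross-stitch module is as follows:
[0046] $$\begin{bmatrix} \mathrm{inputA}_{i+1} \\ \mathrm{inputB}_{i+1} \end{bmatrix} = \begin{bmatrix} \alpha_{AA} & \alpha_{AB} \\ \alpha_{BA} & \alpha_{BB} \end{bmatrix} \begin{bmatrix} \mathrm{outA}_i \\ \mathrm{outB}_i \end{bmatrix}$$
[0047] The outputs of the i-th parallel structure are $\mathrm{outA}_i$ and $\mathrm{outB}_i$, the inputs to the (i+1)-th parallel structure are $\mathrm{inputA}_{i+1}$ and $\mathrm{inputB}_{i+1}$, and $\alpha_{AA}$, $\alpha_{AB}$, $\alpha_{BA}$ and $\alpha_{BB}$ are trainable parameters that control the intensity of feature sharing. This structure allows the model to adaptively learn robust representations of low signal-to-noise ratio signals.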
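The cross-stitch mixing described above can be illustrated with a minimal sketch in plain Python. It assumes the unit is a 2x2 linear mixing of the two branch outputs; the function name `cross_stitch` and the coefficient names `a_aa`, `a_ab`, `a_ba`, `a_bb` are hypothetical, and in an actual network these coefficients would be learned during training rather than fixed by hand.

```python
def cross_stitch(out_a, out_b, alpha):
    """Linearly mix the feature vectors of two parallel branches.

    alpha is a 2x2 list [[a_aa, a_ab], [a_ba, a_bb]] of mixing coefficients
    (hypothetical names; trainable in a real network, fixed here).
    """
    (a_aa, a_ab), (a_ba, a_bb) = alpha
    input_a = [a_aa * fa + a_ab * fb for fa, fb in zip(out_a, out_b)]
    input_b = [a_ba * fa + a_bb * fb for fa, fb in zip(out_a, out_b)]
    return input_a, input_b


out_a, out_b = [1.0, 2.0], [10.0, 20.0]

# Identity coefficients: no sharing, each branch keeps its own features.
same_a, same_b = cross_stitch(out_a, out_b, [[1.0, 0.0], [0.0, 1.0]])

# Equal coefficients: both branches receive the same averaged features.
mixed_a, mixed_b = cross_stitch(out_a, out_b, [[0.5, 0.5], [0.5, 0.5]])
```

The off-diagonal coefficients set how strongly one task's features flow into the other branch, which is the "intensity of feature sharing" referred to in the text.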
[0048] The cross-stitch unit dynamically fuses multi-task features and is combined with the shrinkage module for adaptive noise suppression. The shrinkage module suppresses noise using a soft thresholding function, which is:
[0049] $$y = \begin{cases} x - \tau, & x > \tau \\ 0, & -\tau \le x \le \tau \\ x + \tau, & x < -\tau \end{cases}$$
[0050] Where x is the input feature, y is the output feature, and τ is the threshold parameter. The threshold τ is adaptively calculated based on the feature magnitude, and the derivative of the output with respect to the input is piecewise constant (0 or 1), which can effectively avoid gradient anomalies. The formula is as follows:
[0051] $$\frac{\partial y}{\partial x} = \begin{cases} 1, & x > \tau \\ 0, & -\tau \le x \le \tau \\ 1, & x < -\tau \end{cases}$$
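As a small illustration of the soft thresholding step, the sketch below applies the standard soft threshold to scalar features and returns its piecewise-constant derivative. It assumes a fixed, hand-picked τ; in the shrinkage module τ is computed adaptively from the feature magnitudes, which is not modelled here. The function names are hypothetical.

```python
def soft_threshold(x, tau):
    """Shrink x towards zero by tau; values within [-tau, tau] become 0."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def soft_threshold_grad(x, tau):
    """Derivative of the soft threshold with respect to x: 0 inside the band, 1 outside."""
    return 0.0 if -tau <= x <= tau else 1.0


features = [1.5, 0.2, -0.3, -1.5]
tau = 0.5  # assumed fixed threshold for illustration only
shrunk = [soft_threshold(v, tau) for v in features]
grads = [soft_threshold_grad(v, tau) for v in features]
```

Unlike ReLU, large negative features survive (shifted by τ), while small-magnitude features of either sign, which are more likely to be noise, are zeroed.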
[0052] In the encoding network, the input signal first passes through convolutional layers, batch normalization layers, and ReLU activation layers before entering the cross-stitch module. This module consists of two identical residual shrink blocks and a cross-stitch structure. After passing through multiple repeated cross-stitch modules, shared features are extracted. The shared features are then fed into a dual decoding branch. Each branch first inputs the shared features into multiple residual shrink blocks and performs average pooling on the output. After processing, the prediction result is output through a fully connected layer corresponding to the class output.
[0053] Step S2: Input the raw I / Q signals into the multi-task learning network for training.
[0054] Hyperparameters were set to train the multi-task model. The sample dimensions used were I / Q signals with a specification of (2, 1024). The batch size was set to 32, and the initial number of training epochs was 30, adjusted according to the loss during training. In this embodiment, the number of training epochs was determined to be 75. The model was run in a PyTorch 1.8.1 environment with Python 3.8.19, using Adam as the optimizer and a learning rate of 1e-3. The computing platform included one NVIDIA GeForce RTX 3080 GPU.
[0055] Furthermore, in the following formula (joint optimization function), the parameter λ used to balance the multi-task learning objective function is set to 0.5.
[0056]
[0057] Where $x_i$ denotes the received I/Q signal samples, each with a modulation type label for AMC and a transmitter label for SEI; $h(\cdot)$ is the shared feature extractor; $f_m(\cdot)$ and $f_e(\cdot)$ are the task-specific classifiers for AMC and SEI, respectively; $L_m$ and $L_e$ are the cross-entropy loss functions for classification; and $\lambda \in [0,1]$ is the adaptive task weight parameter.
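To make the role of λ concrete, here is a minimal sketch of a two-task loss. It assumes the joint objective is the convex combination λ·L_m + (1 − λ)·L_e of the two cross-entropy losses; that exact form is an assumption, chosen because it is consistent with λ ∈ [0,1] and the value 0.5 used in this embodiment. The function names and the example probabilities are hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy for one sample: negative log-probability of the true class."""
    return -math.log(probs[label])


def joint_loss(p_mod, y_mod, p_emit, y_emit, lam=0.5):
    """Assumed joint objective: lam * L_m + (1 - lam) * L_e."""
    return lam * cross_entropy(p_mod, y_mod) + (1.0 - lam) * cross_entropy(p_emit, y_emit)


# Hypothetical branch outputs for one sample (already passed through Softmax).
p_mod, y_mod = [0.5, 0.5], 0        # AMC branch: predicted probabilities, true class
p_emit, y_emit = [0.25, 0.75], 1    # SEI branch: predicted probabilities, true class

loss = joint_loss(p_mod, y_mod, p_emit, y_emit, lam=0.5)
amc_only = joint_loss(p_mod, y_mod, p_emit, y_emit, lam=1.0)
```

With λ = 1 the objective reduces to the AMC loss alone and with λ = 0 to the SEI loss alone, so λ = 0.5 weights the two tasks equally.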
[0058] Step S3: Construct a generator that outputs the perturbation signal.
[0059] The original signal is divided into d subsequences using the ruptures library in the signal generator, generating the original mask $x \in 1^d$ (an all-ones vector of length d). A perturbation mask $x' \in \{0,1\}^d$ is generated by randomly reversing elements of x. The replacement noise is sampled from a Gaussian distribution $\mathcal{N}(\mu_i, \sigma_i^2)$, where $\mu_i$ and $\sigma_i$ are the channel mean and standard deviation estimated from the dataset at each time step i. The signal generator synthesizes a perturbation signal based on x'. Specifically, it replaces each subsequence whose position in x' is 0 with Gaussian noise, while retaining each subsequence whose position is 1.
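The perturbation step can be sketched as follows. This is a simplified illustration, not the patented generator: it handles a single channel, takes the segment boundaries as a given list instead of calling the ruptures library, and uses a hypothetical `flip_prob` parameter for how often a mask element is reversed.

```python
import random


def make_perturbation(signal, boundaries, mu, sigma, flip_prob=0.5, rng=None):
    """Build the original mask, a perturbation mask, and the perturbed signal.

    signal     : list of samples for one channel
    boundaries : end index of each subsequence (last one equals len(signal));
                 assumed to come from a change-point segmentation step
    mu, sigma  : per-time-step mean and standard deviation of the replacement noise
    """
    rng = rng or random.Random()
    d = len(boundaries)
    original_mask = [1] * d
    # Reverse each element of the all-ones mask to 0 with probability flip_prob.
    perturb_mask = [0 if rng.random() < flip_prob else 1 for _ in range(d)]

    perturbed = list(signal)
    start = 0
    for keep, end in zip(perturb_mask, boundaries):
        if keep == 0:
            # Replace the whole subsequence with Gaussian noise N(mu_i, sigma_i^2).
            for i in range(start, end):
                perturbed[i] = rng.gauss(mu[i], sigma[i])
        start = end
    return original_mask, perturb_mask, perturbed


signal = [float(i) for i in range(8)]
boundaries = [3, 6, 8]                 # three subsequences: [0:3], [3:6], [6:8]
mu, sigma = [0.0] * 8, [1.0] * 8       # hypothetical dataset statistics
x, x_prime, s_prime = make_perturbation(signal, boundaries, mu, sigma,
                                        rng=random.Random(0))
```

Repeating the call many times yields a set of (mask, perturbed signal) pairs; the masks later serve as inputs to the linear explanation model.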
[0060] Step S4: Obtain the probability distribution.
[0061] The perturbation signal is input into the trained multi-task learning network, and classification results are obtained through the different task branches of the network. Then, the Softmax function is used to convert these results into predicted class probabilities. The Softmax function is defined as follows: for a real vector $z = [z_1, z_2, \ldots, z_K]$ containing K elements, the output of the Softmax function is a vector of length K, $\sigma(z) = [\sigma(z)_1, \sigma(z)_2, \ldots, \sigma(z)_K]$, where each element $\sigma(z)_i$ is calculated as follows:
[0062] $$\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$
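For reference, a minimal Softmax in plain Python is shown below. Subtracting the maximum logit before exponentiating is a common numerical-stability measure added here as an implementation choice; it does not change the result. The example logits are hypothetical.

```python
import math


def softmax(z):
    """Convert a list of K logits into K probabilities that sum to 1."""
    m = max(z)  # shift by the maximum to avoid overflow in exp
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


uniform = softmax([0.0, 0.0])        # equal logits give equal probabilities
probs = softmax([2.0, 1.0, -1.0])    # hypothetical logits from one task branch
```

Each task branch produces its own logit vector, so Softmax is applied separately to the AMC output and the SEI output.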
[0063] Step S5: Train the linear interpretation model.
[0064] Using the probability predicted by the multi-task model for the perturbation signal as the supervision label, a local linear model is trained on the original mask and the perturbation mask, and the optimized model weights $W_g$ are taken as the contribution of each subsequence to the explanation.
[0065] Using the original mask x and the perturbation mask x' as input features, and the probability distribution of the perturbation signal obtained from the multi-task model as supervision labels, a linear model is trained to solve for the explanatory weights $W_g$ of the subsequences. The objective function to be optimized is:
[0066]
[0067] Where F represents the Frobenius distance, $f_{model}(\cdot)$ represents the prediction of the local linear model, and T is the transpose operation. The training samples are weighted by a nearest neighbor function, which assigns higher weights to perturbation masks that are more similar to the original mask. Optimizing the weighted linear model yields the classification decision contribution coefficients $W_g$, which reflect the contribution of each subsequence.
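The fitting step can be sketched as a similarity-weighted least-squares problem. Everything specific in this sketch is an assumption made for illustration: the exponential kernel and its `width`, the small ridge term, the intercept column, and the use of a single class probability as the target. It is not the objective function of the invention, only one common way to fit a local linear surrogate on binary masks.

```python
import math


def similarity(mask, width=0.75):
    """Assumed kernel: masks with fewer zeroed subsequences get higher weight."""
    dist = sum(1 - m for m in mask) / len(mask)
    return math.exp(-(dist ** 2) / (width ** 2))


def solve(a, b):
    """Solve a @ w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_explanation(masks, probs, ridge=1e-6):
    """Minimise sum_k pi_k * (p_k - w^T [mask_k, 1])^2 + ridge * ||w||^2."""
    rows = [list(m) + [1.0] for m in masks]      # trailing 1.0 is an intercept
    n = len(rows[0])
    pis = [similarity(m) for m in masks]
    a = [[sum(p * r[i] * r[j] for p, r in zip(pis, rows)) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(p * r[i] * y for p, r, y in zip(pis, rows, probs)) for i in range(n)]
    return solve(a, b)[:-1]                      # one weight per subsequence


# Toy data: the "network" probability depends linearly on which segments are kept.
masks = [[m0, m1, m2] for m0 in (0, 1) for m1 in (0, 1) for m2 in (0, 1)]
probs = [0.1 + 0.6 * m0 + 0.0 * m1 + 0.2 * m2 for m0, m1, m2 in masks]
weights = fit_explanation(masks, probs)
```

On this toy data the fitted weights come out close to 0.6, 0.0 and 0.2, so the first subsequence is identified as the one the prediction depends on most.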
[0068] Step S6: Normalize the weights and visualize them using a constellation diagram.
[0069] $W_g$ is normalized, and the weights are mapped back to the original signal length. The time-frequency domain signal regions that play a key role in model decision-making are visualized and labeled based on the constellation diagram.
[0070] The explanatory weight $W_g$ is normalized as follows:
[0071]
[0072] The normalized weights are mapped back to the original signal length. Specifically, the same weight is assigned to each time step in each subsequence, generating a corresponding constellation diagram. First, the constellation diagram is drawn using the original IQ signal. Then, time steps in the constellation diagram whose normalized weight values mapped back to the original signal length exceed a certain threshold are highlighted. The highlighted areas in the constellation diagram represent signal segments that play a crucial role in the model's decision-making.
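The mapping and highlighting steps can be sketched as below. Min-max scaling to [0, 1] is used as an assumed stand-in for the normalization, the threshold value is hypothetical, and plotting is left out: the sketch only selects which (I, Q) points would be highlighted on the constellation diagram.

```python
def normalize(weights):
    """Assumed min-max normalization of subsequence weights to [0, 1]."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return [0.0 for _ in weights]
    return [(w - lo) / (hi - lo) for w in weights]


def map_to_time_steps(norm_weights, boundaries):
    """Give every time step the weight of the subsequence it belongs to."""
    per_step, start = [], 0
    for w, end in zip(norm_weights, boundaries):
        per_step.extend([w] * (end - start))
        start = end
    return per_step


def highlighted_points(i_samples, q_samples, per_step, threshold=0.5):
    """Return the (I, Q) points whose mapped weight exceeds the threshold."""
    return [(i, q) for i, q, w in zip(i_samples, q_samples, per_step) if w > threshold]


weights = [0.6, 0.0, 0.2]            # hypothetical explanatory weights, one per subsequence
boundaries = [2, 3, 5]               # subsequence end indices for a 5-sample signal
i_samples = [1.0, 2.0, 3.0, 4.0, 5.0]
q_samples = [-1.0, -2.0, -3.0, -4.0, -5.0]

per_step = map_to_time_steps(normalize(weights), boundaries)
points = highlighted_points(i_samples, q_samples, per_step)
```

The returned points are the samples that would be drawn in a highlight colour over the full constellation plot of the original I/Q signal.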
[0073] Test case
[0074] To obtain the training dataset for the multi-task model, 23 modulation schemes, including QAM, PSK, and PAM, were selected from the open-source TorchSig dataset. The signals were transmitted by seven HackRF One transmitters and received by one USRP B210 receiver. In the experiment, the carrier frequency was 1 MHz, the sampling rate was 16 MHz, the signal-to-noise ratio was -15 dB, and the received signal length was 1024.
[0075] The models trained were the multi-task model and a single-task modulation model (identical in structure to the multi-task model except that it has no emitter individual identification branch). During training, the samples were I/Q signals with a specification of (2, 1024), and the training and test sets were split in an 8:2 ratio. The batch size was set to 32, the number of training epochs was set to 75, and the models were run in a PyTorch 1.8.1 environment with Python 3.8.19. The optimizer used was Adam, and the learning rate was set to 1e-3. The computing platform included an NVIDIA GeForce RTX 3080 GPU.
[0076] The performance of the two models on the test set is shown in Figure 3. The multi-task model improves accuracy across multiple modulation schemes, with an average accuracy improvement of approximately 4%. This indicates that the two tasks are synergistic and that training them jointly improves accuracy.
[0077] In summary, the electromagnetic signal recognition method, storage medium, and device based on interpretable multi-task learning disclosed in this invention can effectively enhance the electromagnetic signal recognition capability while possessing a certain degree of model interpretability.
[0078] It should be noted that the above content merely illustrates the technical concept of the present invention and should not be construed as limiting the scope of protection of the present invention. For those skilled in the art, various improvements and modifications can be made without departing from the principle of the present invention, and all such improvements and modifications fall within the scope of protection of the claims of the present invention.
Claims
1. An electromagnetic target recognition method based on interpretable multi-task learning, characterized in that it comprises the following steps:
S1: Construct a multi-task learning network for modulation recognition and individual recognition. The backbone network of the multi-task learning network is DRSN. The encoding network is built by using a hard sharing method and introducing a cross-stitch module. A dual decoding branch is set up, and a residual shrinking module, an average pooling layer and a fully connected layer with the corresponding number of categories are used.
S2: Input the original signal into the multi-task learning network constructed in step S1, set the hyperparameters, and train the multi-task learning network; the hyperparameters include at least the signal specification, batch size, and number of training rounds.
S3: Construct a generator to output the perturbation signal. The original signal to be interpreted is segmented, the generator generates a perturbation mask by randomly reversing elements, and the perturbation mask is combined with the original signal to obtain the perturbation signal. Specifically, the original signal is segmented into d subsequences, generating a binary mask $x$; a perturbation mask $x'$ is generated by randomly reversing elements of $x$; the replacement noise is sampled from a Gaussian distribution $\mathcal{N}(\mu_i, \sigma_i^2)$, where $\mu_i$ and $\sigma_i$ are the channel mean and standard deviation estimated from the dataset at each time step $i$; and the perturbation signal is synthesized according to $x'$;
S4: Input the perturbation signal into the multi-task learning network trained in step S2 to obtain the probability distribution of classification results under different tasks;
S5: Input the perturbation mask and its similarity into the linear model, and perform model fitting using the probability distribution to obtain the explanatory weight $W_g$ for each subsequence. Using the prediction probability of the perturbation signal by the multi-task model as the supervision label, a local linear model is trained on the original mask and the perturbation mask, and the optimized model weights $W_g$ are taken as the contribution of each subsequence to the explanation; in the optimization objective function, $F$ denotes the Frobenius distance, $f_{model}(\cdot)$ denotes the prediction of the local linear model, and $T$ denotes the transpose operation;
S6: Normalize $W_g$ and map the weights back to the original signal length. The time-frequency domain signal regions that play a key role in model decision-making are visualized and labeled based on the constellation diagram.
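2. The electromagnetic target recognition method based on interpretable multi-task learning as described in claim 1, characterized in that: the cross-stitch structure of the cross-stitch module in step S1 is specifically as follows: assuming the outputs of the i-th parallel structure are $\mathrm{outA}_i$ and $\mathrm{outB}_i$, the inputs $\mathrm{inputA}_{i+1}$ and $\mathrm{inputB}_{i+1}$ of the (i+1)-th parallel structure are: $$\begin{bmatrix} \mathrm{inputA}_{i+1} \\ \mathrm{inputB}_{i+1} \end{bmatrix} = \begin{bmatrix} \alpha_{AA} & \alpha_{AB} \\ \alpha_{BA} & \alpha_{BB} \end{bmatrix} \begin{bmatrix} \mathrm{outA}_i \\ \mathrm{outB}_i \end{bmatrix}$$ where $\alpha_{AA}$, $\alpha_{AB}$, $\alpha_{BA}$ and $\alpha_{BB}$ are trainable parameters.
3. The electromagnetic target recognition method based on interpretable multi-task learning as described in claim 2, characterized in that: the residual shrinkage module in step S1 applies soft thresholding. Soft thresholding sets features close to zero to zero while retaining negative features, and its derivative is 0 or 1. The soft threshold function is defined as: $$y = \begin{cases} x - \tau, & x > \tau \\ 0, & -\tau \le x \le \tau \\ x + \tau, & x < -\tau \end{cases}$$ where $x$ is the input feature, $y$ is the output feature, and $\tau$ is the threshold parameter.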
4. The electromagnetic target recognition method based on interpretable multi-task learning as described in claim 1, characterized in that: In the hyperparameters of step S2, the original signal sample dimension adopts I / Q signal with a specification of (2,1024), the batch size is 32, the training rounds are initially set to 30, and are adjusted according to the loss changes during training.
5. The electromagnetic target recognition method based on interpretable multi-task learning as described in claim 4, characterized in that: In step S4, the perturbation signal is input into the trained multi-task learning network, and the classification result is obtained through different task branches of the network. The result is then converted into a probability distribution using the Softmax function.
6. The electromagnetic target recognition method based on interpretable multi-task learning as described in claim 5, characterized in that: in step S6, the explanatory weight $W_g$ is normalized.
7. A computer-readable storage medium, characterized in that: it stores a computer program that, when executed by a processor, implements the electromagnetic target recognition method based on interpretable multi-task learning as described in any one of claims 1-6.
8. A computer device, characterized in that it comprises: a memory, used to store instructions; and a processor for executing the instructions, causing the computer device to perform the operations of the electromagnetic target recognition method based on interpretable multi-task learning as described in any one of claims 1-6.