A Radar Signal Modulation Recognition Method Based on Temporal Modeling and Cross-Variable Fusion
By adopting a radar signal modulation identification method based on time-series modeling and cross-variable fusion, the problem of declining identification performance in existing technologies is solved, and the rapid and accurate identification of modern radar signal modulation patterns is achieved, improving the model's adaptability and the ability to extract joint features of IQ signals.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- XIDIAN UNIV
- Filing Date
- 2026-03-10
- Publication Date
- 2026-06-30
AI Technical Summary
Existing technologies struggle to quickly and accurately identify the modulation patterns of modern radar signals in complex electromagnetic environments, especially when there is dense interweaving of multi-source radiation and the coexistence of intentional and unintentional interference. Traditional methods suffer from reduced identification performance, lack a global perspective, and are unable to extract joint features from IQ signals.
A radar signal modulation recognition method based on temporal modeling and cross-variable fusion is adopted. By preprocessing radar IQ signals in blocks, a hybrid neural network model is constructed, including a block transform temporal encoder and a cross-variable fusion module of modern temporal convolutional network. The multi-head self-attention mechanism of Transformer architecture and the cross-variable fusion module of ModernTCN architecture are used to capture long-range temporal dependencies and extract joint features of IQ signals.
It improves the model's adaptability and generalization ability, enabling it to calculate the correlation between different positions in the sequence globally, explicitly model the correlation between IQ signals, and achieve accurate identification of various radar signal modulation modes, especially under medium to high signal-to-noise ratio conditions.
Figure CN122307491A_ABST
Description
Technical Field
[0001] This invention relates to the field of signal processing and pattern recognition technology, and in particular to a radar signal modulation recognition method based on time-series modeling and cross-variable fusion.
Background Technology
[0002] Radar signal modulation identification is one of the core technologies in electronic reconnaissance and situational awareness. Essentially, it involves systematically analyzing the parameters and characteristics of intercepted radar signals to accurately identify the modulation patterns used by the target radar. With the rapid development of modern radar technology, multi-functional, software-defined radars have become standard on various advanced platforms. These radars can flexibly select transmission waveforms according to mission requirements and environmental changes, widely employing various signal modulation patterns such as linear frequency modulation (LFM), phase coding, frequency agility, and nonlinear frequency modulation, and even combining multiple modulation patterns within the same pulse. Because these modulation patterns are complex and variable and the parameter space is extremely wide, accurate identification of radar signal modulation has become an indispensable key technology in applications such as electronic reconnaissance, spectrum management, situational awareness, and decision support. Under traditional electromagnetic environment conditions, the number of radiation sources is limited and signal parameters are relatively stable, so effective identification can be achieved using a direct matching method based on feature templates. However, with the rapid development and widespread application of digital radio frequency memory, high-speed digital signal processing technology, and software-based radar architecture, the signal modulation modes of modern radar systems are becoming increasingly diverse, agile, and concealed. The dynamic range of radar signal parameters has significantly expanded, and the flexibility of waveform design has been greatly improved.
In addition, the dense interweaving of multi-source radiation and the coexistence of intentional and unintentional interference in the actual electromagnetic space have caused the recognition performance of traditional methods to drop sharply when facing complex scenarios, making it difficult to meet the actual needs for rapid and accurate discrimination of radar signal modulation.
[0003] Currently, various radar signal modulation identification methods have been proposed in existing technologies. Multilayer perceptron-based radar signal modulation identification methods require domain experts to design and extract statistical and physical features in advance. However, manually designed features are difficult to adapt to new radar systems and unknown signal modulation modes, resulting in limited model generalization ability. Multifunctional radar signal modulation identification methods based on time-series segmentation and clustering are localized processing methods. This localized processing strategy allows the model to capture only the local statistical characteristics within each segment; it fails to establish global time-series correlations across segments and cannot perceive the overall signal evolution trend or long-range dependencies, resulting in a lack of global perspective. The original Patch Time Series Transformer (PatchTST) model uses a channel-independent strategy to process the in-phase (I) and quadrature (Q) channels of IQ signals, artificially severing the inherent correlation and complementarity between the I channel and the Q channel, so that the model fails to extract joint features of the IQ signals.
Summary of the Invention
[0004] The purpose of this invention is to provide a radar signal modulation recognition method based on time-series modeling and cross-variable fusion, which solves the problems of limited model generalization ability, lack of global vision, and inability of the model to achieve joint feature extraction of IQ signals caused by existing technologies.
[0005] To address the aforementioned technical problems, the embodiments of the present invention provide the following technical solutions: The first aspect of this invention provides a radar signal modulation identification method based on time series modeling and cross-variable fusion, comprising: The radar IQ signal to be identified is preprocessed into blocks to obtain block tensors; A hybrid neural network model is constructed, which includes a block transform temporal encoder, a modern temporal convolutional network intervariate fusion module, and a classification head. The block transform temporal encoder is used to capture long-range temporal dependencies, and the modern temporal convolutional network intervariate fusion module is used to perform intervariate fusion on the extracted temporal feature tensors. The block tensors are input into the trained hybrid neural network model to process the block tensors using a block transform temporal encoder, a modern temporal convolutional network intervariate mixing module, and a classification head, and output radar signal modulation recognition results. The trained hybrid neural network model is obtained by training the hybrid neural network model with training samples carrying real signal modulation labels.
[0006] A second aspect of the present invention provides a radar signal modulation identification device based on time series modeling and cross-variable fusion, comprising: The block preprocessing module is used to perform block preprocessing on the radar IQ signal to be identified to obtain block tensors; The building module is used to construct a hybrid neural network model, which includes a block transform temporal encoder, a modern temporal convolutional network intervariate fusion module, and a classification head. The block transform temporal encoder is used to capture long-range temporal dependencies, and the modern temporal convolutional network intervariate fusion module is used to perform intervariate fusion on the extracted temporal feature tensors. The recognition module is used to input the block tensors into the trained hybrid neural network model, so as to process the block tensors using a block transform temporal encoder, a modern temporal convolutional network intervariate hybrid module and a classification head, and output the radar signal modulation recognition result; wherein, the trained hybrid neural network model is obtained by training the hybrid neural network model using training samples carrying real signal modulation labels.
[0007] Compared to existing technologies, the radar signal modulation recognition method based on temporal modeling and cross-variable fusion provided by this invention preprocesses the radar IQ signal to be identified into blocks to obtain block tensors; constructs a hybrid neural network model, which includes a block transform temporal encoder, a modern temporal convolutional network cross-variable fusion module, and a classification head. The block transform temporal encoder is used to capture long-range temporal dependencies, and the modern temporal convolutional network cross-variable fusion module is used to perform cross-variable fusion on the extracted temporal feature tensors; inputs the block tensors into the trained hybrid neural network model to process the block tensors using the block transform temporal encoder, the modern temporal convolutional network cross-variable fusion module, and the classification head, and outputs the radar signal modulation recognition result; wherein, the trained hybrid neural network model is obtained by training the hybrid neural network model using training samples carrying real signal modulation labels. In this way, the hybrid neural network model automatically learns hierarchical feature representations directly from the original IQ signals without relying on manual feature engineering, effectively improving the model's adaptability and generalization ability. The introduction of a block transform time-series encoder based on the Transformer architecture utilizes its multi-head self-attention mechanism to calculate the correlation weights between different positions in the sequence globally, fully exploring the dependencies and periodic patterns spanning long time spans in radar pulse sequences, effectively compensating for the shortcomings of local segmentation methods in global modeling capabilities. 
The hybrid neural network model includes a block transform time-series encoder, a modern temporal convolutional network cross-variable fusion module, and a classification head. While retaining the advantages of PatchTST temporal modeling, it explicitly models the correlation and complementarity between the I and Q signals, achieving joint feature extraction of multi-channel radar signals.
Attached Figure Description
[0008] The above and other objects, features, and advantages of exemplary embodiments of the present invention will become readily apparent upon reading the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of the invention are illustrated by way of example and not limitation, with the same or corresponding reference numerals denoting the same or corresponding parts, wherein: Figure 1 schematically shows a flowchart of a radar signal modulation identification method based on time-series modeling and cross-variable fusion. Figure 2 schematically shows a radar signal modulation identification device based on time-series modeling and cross-variable fusion.
Detailed Implementation
[0009] Exemplary embodiments of the invention will now be described in more detail with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be implemented in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided to enable a more thorough understanding of the invention and to fully convey the scope of the invention to those skilled in the art.
[0010] It should be noted that, unless otherwise stated, the technical or scientific terms used in this invention should have the ordinary meaning as understood by one of ordinary skill in the art.
[0011] The methods described in the embodiments of the present invention will be explained in detail below.
[0012] Figure 1 shows a flowchart of a radar signal modulation identification method based on time-series modeling and cross-variable fusion according to an embodiment of the present invention. As shown in Figure 1, this radar signal modulation identification method based on time-series modeling and cross-variable fusion can include: S101. Perform block preprocessing on the radar IQ signal to be identified to obtain block tensors.
[0013] Specifically, the radar IQ signal to be identified is preprocessed into blocks to obtain block tensors, including: Step A1: Divide the radar IQ signal to be identified into multiple continuous blocks by sliding window according to the block length and sliding step size.
[0014] Step A2: Determine the number of blocks based on the length, block length, and sliding step size of the radar IQ signal to be identified.
[0015] Step A3: Arrange multiple consecutive blocks in chronological order to obtain the block tensor.
[0016] The dimensions of the block tensor are the number of channels, the number of blocks, and the block length.
[0017] Specifically, let the length of the radar IQ signal to be identified, i.e., the length of the original multivariate time series, be $L$, and let the number of variables (number of channels) be $M$ (for IQ dual-channel signals, $M = 2$). The original multivariate time series $X \in \mathbb{R}^{M \times L}$ is divided into multiple consecutive patches (blocks) using a sliding window. The specific operation is as follows: set the patch length to $P$ and the sliding step size to $S$; the number of blocks $N$ is calculated as: $N = \left\lfloor \frac{L - P}{S} \right\rfloor + 1$; after block processing of the original multivariate time series $X$, the block tensor $X_p \in \mathbb{R}^{N \times M \times P}$ is obtained, where the $i$-th block ($i = 1, \dots, N$) corresponds to the $P$ consecutive sampling points with index range $[(i-1)S + 1,\ (i-1)S + P]$ in the original multivariate time series, and $\mathbb{R}^{N \times M \times P}$ denotes a real tensor whose dimensions are the number of blocks $N$, the number of variables $M$, and the block length $P$.
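The sliding-window blocking described above can be sketched in NumPy as follows. This is an illustrative sketch, not the patent's implementation: the function name `make_patches` and its parameter names are hypothetical, and the block count assumes no padding at the end of the signal.

```python
import numpy as np

def make_patches(x, patch_len, stride):
    """Split a (channels, length) signal into overlapping blocks.

    Returns an array of shape (num_patches, channels, patch_len).
    Assumption: no end padding, so
    num_patches = (length - patch_len) // stride + 1.
    """
    channels, length = x.shape
    if length < patch_len:
        raise ValueError("signal is shorter than one block")
    num_patches = (length - patch_len) // stride + 1
    starts = np.arange(num_patches) * stride
    # Gather each window; blocks stay in chronological order.
    return np.stack([x[:, s:s + patch_len] for s in starts], axis=0)

# Toy IQ signal: row 0 plays the I channel, row 1 the Q channel.
iq = np.arange(2 * 16, dtype=np.float32).reshape(2, 16)
patches = make_patches(iq, patch_len=8, stride=4)
print(patches.shape)
```

With a signal length of 16, a block length of 8, and a step of 4, three blocks are produced, and consecutive blocks overlap by four samples.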
[0018] The dataset corresponding to the block tensors is randomly divided into training, validation, and test sets according to a preset ratio, and each set is then standardized. The mean $\mu$ and standard deviation $\sigma$ of each channel are calculated on the training set, and these statistics are then applied to the normalization of the validation and test sets to ensure the consistency of the data distribution.
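A minimal sketch of this split-then-standardize step is given below. The 70/15/15 split ratio, the array layout `(samples, channels, length)`, and the helper names are assumptions made for illustration; the source only states that the ratio is preset.

```python
import numpy as np

def fit_channel_stats(train, eps=1e-8):
    """Per-channel mean and std over a set shaped (samples, channels, length)."""
    mean = train.mean(axis=(0, 2), keepdims=True)
    std = train.std(axis=(0, 2), keepdims=True)
    return mean, std + eps  # eps guards against division by zero

def standardize(x, mean, std):
    return (x - mean) / std

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=2.0, size=(100, 2, 64))

# Random split (assumed ratio 70/15/15).
idx = rng.permutation(len(data))
train, val, test = data[idx[:70]], data[idx[70:85]], data[idx[85:]]

# Statistics come from the training set only and are reused elsewhere.
mean, std = fit_channel_stats(train)
train_n, val_n, test_n = (standardize(s, mean, std) for s in (train, val, test))
```

Reusing the training statistics keeps information from the validation and test sets out of the preprocessing step.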
[0019] S102. Construct a hybrid neural network model.
[0020] The hybrid neural network model includes a block transform temporal encoder, a modern temporal convolutional network intervariate fusion module, and a classification head. The block transform temporal encoder is used to capture long-range temporal dependencies, and the modern temporal convolutional network intervariate fusion module is used to perform intervariate fusion on the extracted temporal feature tensors.
[0021] Specifically, the block transform timing encoder includes a block embedding layer, a position coding layer, and a stacked multi-layer Transformer encoder layer connected in sequence; The block embedding layer is used to linearly project each block from the original sampling space to the feature embedding space to obtain the embedded feature tensor. The position encoding layer is used to add the position encoding vector to the embedded feature tensor to obtain the encoder input representation carrying position information; Stacked multi-layer Transformer encoder layers are used to perform multi-head self-attention calculation and feedforward neural network transformation on the encoder input representation layer by layer, extracting deep temporal features layer by layer to output a temporal feature tensor, which carries long-range temporal dependencies.
[0022] Specifically, in the Patch Time Series Transformer (PatchTST), a patch embedding layer is constructed to map each patch from the $P$-dimensional original sampling space to the $D$-dimensional feature embedding space, where $D$ represents the feature dimension of the model. The block embedding layer is implemented using a linear projection, specifically: $X_e = X_p W_p + b_p$; where $X_p$ is the block tensor, $W_p \in \mathbb{R}^{P \times D}$ is the learnable projection weight matrix, $b_p \in \mathbb{R}^{D}$ is the bias vector, and $X_e \in \mathbb{R}^{N \times M \times D}$ is the embedded feature tensor. Here $\mathbb{R}^{N \times M \times D}$ denotes a real tensor whose dimensions are the number of blocks $N$, the number of variables $M$, and the feature dimension of the model $D$; $\mathbb{R}^{D}$ denotes a real vector whose dimension is the feature dimension of the model $D$; and $\mathbb{R}^{P \times D}$ denotes a real matrix whose dimensions are the block length $P$ and the feature dimension of the model $D$.
[0023] This invention supports two embedding modes: in the shared embedding mode, all channels share the same set of projection parameters; in the independent embedding mode, each channel uses its own independent projection parameters.
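The two embedding modes can be contrasted in a few lines of NumPy. This is a sketch under assumed shapes (blocks, channels, block length); the variable names and the random weights are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, P, D = 3, 2, 8, 16          # blocks, channels, block length, model dim
x_p = rng.normal(size=(N, M, P))  # block tensor

# Shared mode: a single (P, D) projection is applied to every channel.
w_shared = rng.normal(size=(P, D))
b_shared = np.zeros(D)
x_e_shared = x_p @ w_shared + b_shared            # (N, M, D)

# Independent mode: channel m has its own (P, D) projection and bias.
w_ind = rng.normal(size=(M, P, D))
b_ind = np.zeros((M, D))
x_e_ind = np.einsum("nmp,mpd->nmd", x_p, w_ind) + b_ind  # (N, M, D)

print(x_e_shared.shape, x_e_ind.shape)
```

Both modes produce a tensor of the same shape; the independent mode simply multiplies the number of projection parameters by the number of channels.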
[0024] To enable the model to perceive the positional information of each element in the segmented sequence, PatchTST introduces a positional encoding mechanism. This invention supports multiple positional encoding methods, including: Zero-initialized learnable positional encoding: Initialized as an all-zero vector, the positional representation is adaptively learned through the training process; Sine position coding: A fixed sine-cosine function is used to generate the position coding vector; Learnable positional encoding: a positional embedding vector that is randomly initialized and optimized through backpropagation.
[0025] The position encoding vector $E_{pos}$ is added to the embedded features and the result is regularized by the Dropout layer to obtain the input representation of the encoder $Z_0 \in \mathbb{R}^{N \times D}$, a real matrix whose dimensions are the number of blocks $N$ and the feature dimension of the model $D$.
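As an illustration of the fixed sine-cosine option, the sketch below uses the standard Transformer sinusoidal form; whether the patent uses exactly this variant is an assumption. Dropout is omitted, as it would be at inference time, and an even model dimension is assumed.

```python
import numpy as np

def sinusoidal_encoding(num_patches, d_model):
    """Standard sine-cosine position encoding, shape (num_patches, d_model).

    Assumes d_model is even.
    """
    pos = np.arange(num_patches)[:, None]       # block index
    i = np.arange(0, d_model, 2)[None, :]       # even feature indices
    angle = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((num_patches, d_model))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

rng = np.random.default_rng(0)
x_e = rng.normal(size=(6, 8))                   # embedded blocks of one channel
z_0 = x_e + sinusoidal_encoding(6, 8)           # encoder input (no dropout)
print(z_0.shape)
```

The learnable variants described above would replace the fixed table with a trainable array of the same shape.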
[0026] Specifically, the stacked multi-layer Transformer encoder includes multiple Transformer encoder layers connected in sequence. Each Transformer encoder layer includes a multi-head self-attention sub-layer, a first residual connection and normalization unit, a feedforward neural network sub-layer, and a second residual connection and normalization unit connected in sequence. The multi-head self-attention sublayer is used to perform multi-head self-attention calculation on the encoder input representation and output attention-weighted features. The first residual connection and normalization unit is used to perform residual connection and layer normalization on the attention-weighted features and the encoder input representation to obtain the first normalized features; The feedforward neural network sublayer is used to perform feedforward neural network transformation on the first normalized feature to obtain the feedforward transformed feature. The second residual connection and normalization unit is used to perform residual connection and layer normalization on the feedforward transform features and the first normalized features to obtain the output coding features corresponding to the current Transformer encoder layer. The output encoded features of the current Transformer encoder layer are used as the input of the next Transformer encoder layer. The layers are processed sequentially through each Transformer encoder layer, and the temporal feature tensor is output by the last Transformer encoder layer.
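[0027] Specifically, in the multi-head self-attention sub-layer, for the input sequence $Z \in \mathbb{R}^{(B \cdot M) \times N \times D}$ (where $B$ is the batch size), the query, key, and value matrices are first generated through three independent linear projections: $Q = Z W_Q,\ K = Z W_K,\ V = Z W_V$; where $W_Q$ is the learnable parameter of the query matrix $Q$, $W_K$ is the learnable parameter of the key matrix $K$, $W_V$ is the learnable parameter of the value matrix $V$, $W_Q \in \mathbb{R}^{D \times d_k}$, $W_K \in \mathbb{R}^{D \times d_k}$, $W_V \in \mathbb{R}^{D \times d_v}$, $H$ is the number of attention heads, $d_k$ is the vector dimension used to calculate the attention weights, and $d_v$ is the vector dimension of the extracted information content. $\mathbb{R}^{D \times d_k}$ denotes a real matrix whose dimensions are the feature dimension of the model $D$ and the vector dimension for calculating attention weights $d_k$, and $\mathbb{R}^{(B \cdot M) \times N \times D}$ denotes a real tensor whose dimensions are the batch size multiplied by the number of variables $B \cdot M$, the number of blocks $N$, and the feature dimension of the model $D$.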
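[0028] The scaled dot-product attention is calculated as: $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$; where $K^{\top}$ is the transpose of the key matrix $K$ and $d_k$ is the vector dimension used to calculate the attention weights. Multi-head attention concatenates the outputs of the $H$ individual attention heads and then applies a linear projection to obtain the final output. This invention optionally supports a residual attention mechanism, which accumulates the attention scores from the previous layer into the current layer, enhancing gradient flow.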
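The following NumPy sketch illustrates multi-head scaled dot-product self-attention over the block axis. It is a simplified illustration: it assumes the key and value dimensions both equal the model dimension divided by the number of heads, omits biases and the optional residual attention, and uses hypothetical function and weight names.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(z, w_q, w_k, w_v, w_o, num_heads):
    """z: (batch, num_patches, d_model); every weight: (d_model, d_model)."""
    b, n, d = z.shape
    d_k = d // num_heads

    def split(t):  # (b, n, d) -> (b, heads, n, d_k)
        return t.reshape(b, n, num_heads, d_k).transpose(0, 2, 1, 3)

    q, k, v = split(z @ w_q), split(z @ w_k), split(z @ w_v)
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(d_k)   # (b, h, n, n)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (b, h, n, d_k)
    concat = heads.transpose(0, 2, 1, 3).reshape(b, n, d) # concatenate heads
    return concat @ w_o, weights

rng = np.random.default_rng(0)
BM, N, D, H = 4, 5, 16, 4          # (batch * channels), blocks, model dim, heads
z = rng.normal(size=(BM, N, D))
w_q, w_k, w_v, w_o = (rng.normal(size=(D, D)) / np.sqrt(D) for _ in range(4))
out, attn = multi_head_self_attention(z, w_q, w_k, w_v, w_o, H)
print(out.shape, attn.shape)
```

Each row of the attention weights sums to one, so every output block is a weighted average over all blocks of the same channel, which is what gives the encoder its global view of the sequence.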
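[0029] The feedforward neural network sublayer employs a two-layer fully connected structure with the GELU activation function between the two layers: $\mathrm{FFN}(Z') = \mathrm{Linear}_2(\mathrm{Dropout}(\mathrm{GELU}(\mathrm{Linear}_1(Z'))))$; where $\mathrm{Linear}_1$ maps the feature dimension of the model $D$ to $d_{ff}$ and $\mathrm{Linear}_2$ maps $d_{ff}$ back to $D$, $d_{ff}$ is the hidden layer dimension of the feedforward network, typically set to 2 to 4 times $D$, $Z'$ is the first normalized feature, and $\mathrm{Dropout}$ is the random deactivation layer.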
[0030] In the first residual connection and normalization unit and the second residual connection and normalization unit, each sub-layer is followed by a residual connection and a layer normalization or batch normalization operation. This invention supports two normalization positioning strategies: pre-normalization and post-normalization. Taking post-normalization as an example: $Z_{out} = \mathrm{Norm}(Z_{in} + \mathrm{Sublayer}(Z_{in}))$; $n_{layers}$ Transformer encoder layers are stacked sequentially to form the complete PatchTST encoder (TSTEncoder). The input block sequence is encoded layer by layer, and the output is the temporal feature tensor $Z \in \mathbb{R}^{B \times M \times D \times N}$, where $Z_{in}$ is the input of the sub-layer, $Z_{out}$ is the output encoded feature corresponding to the current Transformer encoder layer, $\mathrm{Sublayer}(\cdot)$ refers to a sub-layer within the Transformer encoder layer, specifically the multi-head self-attention sub-layer or the feedforward neural network sub-layer, and $\mathbb{R}^{B \times M \times D \times N}$ denotes a real tensor whose dimensions are the batch size $B$, the number of variables $M$, the feature dimension of the model $D$, and the number of blocks $N$, in which the block sequence of each channel is encoded independently to capture long-range dependencies in the time dimension.
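A post-normalization feedforward sub-layer can be sketched as below. This is a simplified illustration: it uses the tanh approximation of GELU, layer normalization without learnable scale and shift, and no dropout (as at inference time); all names and sizes are hypothetical.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-5):
    # Normalizes over the feature axis; learnable affine terms are omitted.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def ffn(x, w1, b1, w2, b2):
    # Two fully connected layers with GELU in between.
    return gelu(x @ w1 + b1) @ w2 + b2

def post_norm(x, sublayer):
    # Post-normalization: residual connection first, then normalization.
    return layer_norm(x + sublayer(x))

rng = np.random.default_rng(0)
N, D, D_FF = 5, 16, 32             # blocks, model dim, hidden dim (2 x D)
x = rng.normal(size=(N, D))
w1, b1 = rng.normal(size=(D, D_FF)) * 0.1, np.zeros(D_FF)
w2, b2 = rng.normal(size=(D_FF, D)) * 0.1, np.zeros(D)
y = post_norm(x, lambda t: ffn(t, w1, b1, w2, b2))
print(y.shape)
```

A pre-normalization variant would instead compute `x + sublayer(layer_norm(x))`, which keeps the residual path free of normalization.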
[0031] Specifically, the intervariate blending module of a modern temporal convolutional network comprises multiple intervariate blending blocks connected in sequence.
[0032] Specifically, the radar IQ signal to be identified includes variables from multiple channels, and each intervariate mixing block includes a depth temporal convolution submodule, an intravariate feature mixing submodule, an intervariate feature mixing submodule, and a residual connection unit; The deep temporal convolution submodule is used to perform deep convolution processing on temporal feature tensors through reparameterized large kernel deep convolutional layers to obtain deep temporal convolutional features. The intra-variable feature mixing submodule is used to perform grouped point convolution processing on deep temporal convolution features to achieve the mixing of feature dimensions within each channel to obtain intra-variable mixed features; The cross-variable feature fusion submodule is used to perform dimension transpose and grouped point convolution processing on intra-variable fusion features to achieve information interaction and fusion between different variables and obtain cross-variable fusion features; The residual connection unit is used to perform residual connections between intervariate mixture features and temporal feature tensors to obtain intervariate fusion features output by multiple intervariate mixture blocks.
[0033] Specifically, regarding the Modern Temporal Convolutional Network (ModernTCN) cross-variable mixing module: because the PatchTST encoder employs a channel-independent strategy, the feature extraction processes for the individual variables (such as the I and Q signals) are isolated from one another, making it impossible to model the correlation between channels. To address this issue, this invention introduces a cross-variable mixing module (V-Mixing Module) based on the ModernTCN architecture. This module explicitly achieves cross-channel feature interaction and fusion through a combination of depthwise separable convolutions and grouped pointwise convolutions.
[0034] The reparameterized large kernel convolution layer is used for local feature extraction in the time dimension. This layer adopts a multi-branch parallel structure during the training phase, and can be equivalently merged into a single convolution operation during the inference phase.
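[0035] Training-phase structure of the reparameterized large kernel convolutional layer: let the large kernel size be $K_L$ and the small kernel size be $K_S$ ($K_S < K_L$), and let the input feature map be $X$. A two-branch structure is used during training. The large kernel branch is a depthwise convolution (with the number of groups equal to the number of input channels) combined with batch normalization; the large kernel branch output $Y_L$ is given by: $Y_L = \mathrm{BN}_L(\mathrm{DWConv}_{K_L}(X))$; where $\mathrm{DWConv}_{K_L}$ is the depthwise convolution with large kernel size $K_L$, and $\mathrm{BN}_L$ is batch normalization.
[0036] The small kernel branch combines a depthwise convolution with batch normalization. The small kernel branch output $Y_S$ is given by: $Y_S = \mathrm{BN}_S(\mathrm{DWConv}_{K_S}(X))$; where $\mathrm{DWConv}_{K_S}$ is the depthwise convolution with small kernel size $K_S$, and $\mathrm{BN}_S$ is batch normalization.
[0037] The output $Y$ of the reparameterized large kernel convolutional layer is obtained by adding the two branches together: $Y = Y_L + Y_S$.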
[0038] Merging at the inference stage of the reparameterized large kernel convolutional layer: leveraging the linearity of batch normalization, the convolution kernels and biases of the large and small kernel branches are merged into an equivalent single large kernel convolution. Specifically, for the convolutional layer weights $W$ and the batch normalization parameters $(\gamma, \beta, \mu, \sigma)$, the equivalent fusion formulas are: $W' = \frac{\gamma}{\sqrt{\sigma^2 + \epsilon}}\, W$; $b' = \beta - \frac{\gamma \mu}{\sqrt{\sigma^2 + \epsilon}}$; where $\gamma$ is the first learnable parameter of the batch normalization layer, $\beta$ is the second learnable parameter of the batch normalization layer, $\mu$ is the mean obtained from the batch normalization layer statistics, $\sigma$ is the standard deviation obtained from the batch normalization layer statistics, $W'$ is the equivalent convolution kernel weight after fusion, $\epsilon$ is a very small constant, and $b'$ is the equivalent bias after fusion.
[0039] The small kernel weights need to be zero-padded to the same size as the large kernel and then added to the large kernel weights to obtain the final equivalent convolution kernel.
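The equivalence between the two-branch training structure and the merged single convolution can be checked numerically. The sketch below is an illustration under stated assumptions: odd kernel sizes, "same" zero padding, convolutions without their own bias, batch normalization in inference mode with fixed statistics, and hypothetical helper names.

```python
import numpy as np

def depthwise_conv1d(x, w, bias=None):
    """x: (channels, length); w: (channels, k) with odd k; 'same' zero padding."""
    c, k = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    out = np.stack([np.correlate(xp[i], w[i], mode="valid") for i in range(c)])
    return out if bias is None else out + bias[:, None]

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    scale = gamma / np.sqrt(var + eps)
    return scale[:, None] * (y - mean[:, None]) + beta[:, None]

def fuse_conv_bn(w, gamma, beta, mean, var, eps=1e-5):
    # Fold the batch normalization into the kernel weights and a bias.
    scale = gamma / np.sqrt(var + eps)
    return w * scale[:, None], beta - mean * scale

def merge_branches(w_large, bn_large, w_small, bn_small):
    wl, bl = fuse_conv_bn(w_large, *bn_large)
    ws, bs = fuse_conv_bn(w_small, *bn_small)
    pad = (wl.shape[1] - ws.shape[1]) // 2
    ws = np.pad(ws, ((0, 0), (pad, pad)))  # centre the small kernel in zeros
    return wl + ws, bl + bs

rng = np.random.default_rng(0)
C, LEN, K_LARGE, K_SMALL = 4, 32, 7, 3
x = rng.normal(size=(C, LEN))
w_l, w_s = rng.normal(size=(C, K_LARGE)), rng.normal(size=(C, K_SMALL))

def rand_bn():
    # (gamma, beta, running mean, running variance)
    return (rng.normal(size=C), rng.normal(size=C),
            rng.normal(size=C), rng.uniform(0.5, 2.0, size=C))

bn_l, bn_s = rand_bn(), rand_bn()
two_branch = (batch_norm(depthwise_conv1d(x, w_l), *bn_l)
              + batch_norm(depthwise_conv1d(x, w_s), *bn_s))
w_eq, b_eq = merge_branches(w_l, bn_l, w_s, bn_s)
single = depthwise_conv1d(x, w_eq, b_eq)
print(np.allclose(two_branch, single))
```

The merged kernel gives the same output as the two branches while requiring only one convolution at inference time.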
[0040] The multiple V-Mixing Blocks are the core components for realizing channel interaction, and their input and output shapes are all $(B, M, D, N)$, where $B$ is the batch size, $M$ is the number of variables, $D$ is the feature dimension of the model, and $N$ is the number of blocks. Each intervariate mixing block contains the following depthwise temporal convolution submodule, intra-variable feature mixing submodule, intervariate feature mixing submodule, and residual connection unit: the depthwise temporal convolution submodule reshapes the input tensor, i.e., the temporal feature tensor, into $(B, M \cdot D, N)$, and temporal feature extraction is performed using the reparameterized large-kernel depthwise convolution: $X_{dw} = \mathrm{RepLKConv}(X)$; where $X_{dw}$ is the depthwise temporal convolution feature, $\mathrm{RepLKConv}$ is the reparameterized large kernel convolution, and the number of groups in the depthwise convolution is set to $M \cdot D$. This ensures that each feature channel undergoes the convolution operation independently, maintaining computational efficiency.
[0041] The intra-variable feature mixing submodule (FFN1) uses grouped pointwise convolution to mix the feature dimensions within each variable. The number of groups is set to $M$; that is, the $D$ features within each variable interact with each other, but there is no information exchange between different variables.
[0042] $H_1 = \mathrm{PWConv}_{up}(X_{dw})$; $\tilde{H}_1 = \mathrm{Act}(H_1)$; $X_{intra} = \mathrm{PWConv}_{down}(\tilde{H}_1)$; where $\mathrm{PWConv}_{up}$ changes the number of channels from $M \cdot D$ to $M \cdot d_{ff}$ ($d_{ff} = r \cdot D$), and $\mathrm{PWConv}_{down}$ restores the number of channels to $M \cdot D$. $X_{dw}$ is the depthwise temporal convolution feature; $H_1$ is the expanded depthwise temporal convolution feature; $\mathrm{PWConv}_{up}$ is the expansion projection, a one-dimensional convolution with $M$ groups and kernel size 1; $\tilde{H}_1$ is the expanded depthwise temporal convolution feature after the activation function $\mathrm{Act}$; $X_{intra}$ is the intra-variable mixed feature; $\mathrm{PWConv}_{down}$ is the compression projection, a one-dimensional convolution with $M$ groups and kernel size 1; $d_{ff}$ is the hidden layer dimension of the feedforward network; and $r$ is the feedforward network expansion ratio.
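[0043] The intervariate feature mixing submodule (FFN2) enables information exchange between the I and Q signals (or more channels). It transposes the tensor dimensions and then applies pointwise convolutions grouped by feature dimension. The specific operation is as follows: first, the tensor is transposed from $(B, M \cdot D, N)$ to $(B, D \cdot M, N)$; then pointwise convolutions with the number of groups set to $D$ are applied, so that the information from different variables (channels) interacts and fuses within each feature dimension: $H_2 = \mathrm{PWConv}'_{up}(X_{intra})$; $\tilde{H}_2 = \mathrm{Act}(H_2)$; $X_{cross} = \mathrm{PWConv}'_{down}(\tilde{H}_2)$; finally, the tensor is transposed back to its original shape $(B, M, D, N)$, where $B$ is the batch size, $M$ is the number of variables, $D$ is the feature dimension of the model, $N$ is the number of blocks, $X_{intra}$ is the intra-variable mixed feature, $H_2$ is the expanded intra-variable mixed feature, $\tilde{H}_2$ is the expanded intra-variable mixed feature after the activation function $\mathrm{Act}$, and $X_{cross}$ is the cross-variable mixed feature.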
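The difference between grouping by variable and grouping by feature dimension can be demonstrated with a small NumPy sketch. It is deliberately simplified: each submodule is reduced to a single grouped pointwise convolution (no expansion, activation, or compression), and the helper names are hypothetical. The check perturbs only the Q channel and observes whether the I channel output changes.

```python
import numpy as np

def grouped_pointwise(x, w):
    """Grouped kernel-size-1 convolution.

    x: (batch, groups * c_in, n); w: (groups, c_out, c_in)
    returns (batch, groups * c_out, n)
    """
    g, c_out, c_in = w.shape
    b, _, n = x.shape
    xg = x.reshape(b, g, c_in, n)
    yg = np.einsum("goi,bgin->bgon", w, xg)   # each group mixes only its own channels
    return yg.reshape(b, g * c_out, n)

rng = np.random.default_rng(0)
B, M, D, N = 1, 2, 4, 6                      # batch, variables (I/Q), features, blocks
x = rng.normal(size=(B, M, D, N))
w_intra = rng.normal(size=(M, D, D))         # groups = M: mix features per variable
w_cross = rng.normal(size=(D, M, M))         # groups = D: mix variables per feature

def intra_variable(t):
    return grouped_pointwise(t.reshape(B, M * D, N), w_intra).reshape(B, M, D, N)

def cross_variable(t):
    tt = t.transpose(0, 2, 1, 3).reshape(B, D * M, N)      # (B, D*M, N)
    yt = grouped_pointwise(tt, w_cross).reshape(B, D, M, N)
    return yt.transpose(0, 2, 1, 3)                        # back to (B, M, D, N)

x_perturbed = x.copy()
x_perturbed[:, 1] += 1.0                     # change the Q channel only

i_same_after_intra = np.allclose(intra_variable(x)[:, 0], intra_variable(x_perturbed)[:, 0])
i_same_after_cross = np.allclose(cross_variable(x)[:, 0], cross_variable(x_perturbed)[:, 0])
print(i_same_after_intra, i_same_after_cross)
```

Grouping by variable leaves the I output untouched when only Q changes, whereas grouping by feature dimension after the transpose lets the change in Q reach the I output, which is the cross-channel interaction the module is meant to provide.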
[0044] The residual connection unit performs a residual connection between the outputs of the three sub-modules mentioned above—the deep temporal convolution sub-module, the intra-variable feature mixing sub-module, and the inter-variable feature mixing sub-module—and the block input.
[0045] V-Mixing Stage stacking: several intervariate mixing blocks are stacked sequentially to form the intervariate mixing stage (VMixingStage). The output of each intervariate mixing block serves as the input to the next block, progressively deepening the degree of cross-channel feature fusion.
[0046] Specifically, the hybrid neural network model adopts either an end-cascaded architecture or an alternating hybrid architecture; the end-cascaded architecture consists of a block transform temporal encoder, a modern temporal convolutional network cross-variable mixing module, and a classification head connected in sequence. The alternating hybrid architecture includes a block transform temporal encoder and at least one modern temporal convolutional network cross-variable mixing module, wherein the modern temporal convolutional network cross-variable mixing module is inserted between the Transformer encoder layers of the block transform temporal encoder, and the output of the alternating hybrid architecture is connected to the classification head.
[0047] Specifically, the hybrid neural network model is an integration of the PatchTST–ModernTCN model.
[0048] There are two integration strategies for the PatchTST–ModernTCN model. The end-cascaded mode works as follows: after PatchTST completes temporal feature extraction, its output is fed directly into the intervariate mixing blocks for channel fusion. The data flow is as follows: $X_p \xrightarrow{\mathrm{PatchTST}} Z \xrightarrow{\mathrm{VMixingStage}} F \xrightarrow{\mathrm{ClassificationHead}} \hat{y}$; where $X_p$ is the block tensor, $Z$ is the temporal feature tensor output by PatchTST, $F$ is the cross-variable fusion feature, and the prediction result $\hat{y}$ is the radar signal modulation identification result.
[0049] The alternating mixing mode is specifically as follows: within PatchTST, a cross-variable mixing stage, or cross-variable mixing block, is inserted after every $k$ Transformer encoder layers to achieve alternating temporal modeling and channel mixing. Let the Transformer layer index be $l$ ($1 \le l \le n_{layers}$); the V-Mixing Stage is then inserted at the positions satisfying $l \bmod k = 0$, where $k$ is the insertion frequency of the intervariate mixing block and $n_{layers}$ is the total number of layers in PatchTST.
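[0050] In the alternating mixing mode, a dimension transformation is required between the Transformer encoder layer (which processes tensors of shape $(B \cdot M, N, D)$) and the V-Mixing block (which processes tensors of shape $(B, M, D, N)$), where $B$ is the batch size, $M$ is the number of variables, $N$ is the number of blocks, and $D$ is the feature dimension of the model. Before entering V-Mixing: $(B \cdot M, N, D) \rightarrow (B, M, N, D) \rightarrow (B, M, D, N)$; after leaving V-Mixing: $(B, M, D, N) \rightarrow (B, M, N, D) \rightarrow (B \cdot M, N, D)$.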
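The two dimension transformations amount to a reshape plus a transpose. The sketch below assumes that the merged leading axis is ordered batch-major (sample index first, then channel index); the function names are hypothetical.

```python
import numpy as np

def to_vmixing(z, batch, num_vars):
    """(batch * num_vars, blocks, dim) -> (batch, num_vars, dim, blocks)."""
    _, n, d = z.shape
    return z.reshape(batch, num_vars, n, d).transpose(0, 1, 3, 2)

def to_transformer(x):
    """(batch, num_vars, dim, blocks) -> (batch * num_vars, blocks, dim)."""
    b, m, d, n = x.shape
    return x.transpose(0, 1, 3, 2).reshape(b * m, n, d)

B, M, N, D = 3, 2, 5, 4
z = np.arange(B * M * N * D, dtype=float).reshape(B * M, N, D)
x = to_vmixing(z, B, M)
z_back = to_transformer(x)
print(x.shape, z_back.shape)
```

The round trip recovers the original tensor exactly, so the cross-variable mixing stage can be inserted between encoder layers without disturbing the layout that the Transformer layers expect.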
[0051] Specifically, the classification head consists of a global average pooling layer, a flattening layer, a Dropout layer, and a fully connected layer connected in sequence; the global average pooling layer is used to perform global average pooling on the cross-variable fusion features to obtain sequence-level feature vectors; the flattening layer is used to flatten the sequence-level feature vectors into one-dimensional feature vectors; the Dropout layer is used to randomly deactivate elements of the one-dimensional feature vectors to obtain regularized feature vectors; the fully connected layer maps the regularized feature vectors to the radar signal modulation category space and outputs the class prediction logits, from which the radar signal modulation recognition result is determined.
[0052] Specifically, for the radar signal modulation classification task, this invention adopts the following classification head structure. First, global average pooling is performed on the encoder output, i.e., the cross-variable fusion feature $Z_f$, along the time dimension (the block dimension $N$) to obtain the sequence-level feature vector $z$: $z = \frac{1}{N} \sum_{n=1}^{N} Z_f^{(n)}$; where $n$ indexes the $n$-th block and $Z_f^{(n)}$ denotes all features corresponding to the $n$-th block.
[0053] The sequence-level feature vector is flattened into a one-dimensional feature vector $v$: $v = \mathrm{Flatten}(z)$; where $\mathrm{Flatten}(\cdot)$ is the flattening operation.
[0054] After Dropout regularization, the vector is mapped to the category space through a fully connected layer: $\hat{y} = \mathrm{FC}(\mathrm{Dropout}(v)) \in \mathbb{R}^{B \times K}$; where $K$ is the total number of radar signal modulation categories, $\mathrm{FC}(\cdot)$ is the fully connected layer, $\mathbb{R}^{B \times K}$ is the set of real matrices whose dimensions are the batch size $B$ and the total number of radar signal modulation categories $K$, and $\hat{y}$ is the radar signal modulation recognition result.
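The classification head described above may be sketched as follows. This is a minimal NumPy illustration, not the trained model: the feature layout (batch, variables, blocks, features), the sizes, and the random weights are assumptions, and Dropout is applied only when a rate and a random generator are supplied.

```python
import numpy as np

rng = np.random.default_rng(0)
B, C, N, D, K = 4, 2, 8, 16, 11  # example sizes; K = 11 matches the dataset used later

fused = rng.standard_normal((B, C, N, D))   # cross-variable fusion features (assumed layout)
W = rng.standard_normal((C * D, K)) * 0.01  # fully connected weights (random stand-ins)
b = np.zeros(K)


def classification_head(x, weight, bias, drop_p=0.0, drop_rng=None):
    pooled = x.mean(axis=2)                     # global average pooling over blocks -> (B, C, D)
    flat = pooled.reshape(pooled.shape[0], -1)  # flatten -> (B, C*D)
    if drop_p > 0.0 and drop_rng is not None:   # inverted dropout, used during training only
        mask = drop_rng.random(flat.shape) >= drop_p
        flat = flat * mask / (1.0 - drop_p)
    return flat @ weight + bias                 # category prediction logits -> (B, K)


logits = classification_head(fused, W, b)
print(logits.shape)
```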
[0055] S103. Input the block tensor into the trained hybrid neural network model to process the block tensor using the block transform temporal encoder, the modern temporal convolutional network intervariate hybrid module and the classification head, and output the radar signal modulation recognition result.
[0056] The trained hybrid neural network model is obtained by training the hybrid neural network model using training samples carrying real signal modulation labels.
[0057] Specifically, during training of the hybrid neural network model, the cross-entropy loss function is used as the optimization objective for the classification task. For $B$ training samples (batch size $B$) with true labels $y = (y_1, \dots, y_B)$, the model outputs the class prediction logits $\hat{y} = (\hat{y}_1, \dots, \hat{y}_B) \in \mathbb{R}^{B \times K}$, where $\hat{y}_i \in \mathbb{R}^{K}$ holds the class prediction logits of the $i$-th sample.
[0058] First, the Softmax function is applied to the predicted logits to obtain the class probability distribution: $p_{i,k} = \frac{\exp(\hat{y}_{i,k})}{\sum_{j=1}^{K} \exp(\hat{y}_{i,j})}$; where $p_{i,k}$ is the probability that the $i$-th sample belongs to the $k$-th class, $\hat{y}_{i,k}$ is the predicted logit of the $i$-th sample for the $k$-th class, and $\hat{y}_{i,j}$ is the predicted logit of the $i$-th sample for the $j$-th class.
[0059] The cross-entropy loss $\mathcal{L}_{\mathrm{CE}}$ is calculated as: $\mathcal{L}_{\mathrm{CE}} = -\frac{1}{B} \sum_{i=1}^{B} \log p_{i, y_i}$; where $y_i$ is the true class label of the $i$-th sample and $p_{i, y_i}$ is the predicted probability of the $i$-th sample on its true class label $y_i$.
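The Softmax and cross-entropy computations can be illustrated numerically as follows. The sketch subtracts the row maximum before exponentiation, a common numerical-stability step that is not stated in the text; the example logits and labels are arbitrary.

```python
import numpy as np


def softmax(logits):
    """Row-wise Softmax with the row maximum subtracted for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(logits, labels):
    """Mean negative log-probability assigned to the true class of each sample."""
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())


logits = np.array([[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]])  # arbitrary example logits
labels = np.array([0, 2])                                # arbitrary true class indices
loss = cross_entropy(logits, labels)
print(loss)
```

For a sample with identical logits over $K$ classes, the per-sample loss equals $\log K$, which is a convenient sanity check for an untrained model.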
[0060] During training of the hybrid neural network model, the AdamW optimizer is used to update the model parameters. AdamW adds decoupled weight decay to the Adam optimizer, which helps alleviate overfitting. The parameter update rules are: $m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t$; $v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2$; $\hat{m}_t = \frac{m_t}{1 - \beta_1^t}$, $\hat{v}_t = \frac{v_t}{1 - \beta_2^t}$; $\theta_t = \theta_{t-1} - \eta \left( \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} + \lambda \theta_{t-1} \right)$; where $g_t$ is the gradient at the $t$-th training round, $\beta_1$ is the first momentum coefficient, $\beta_2$ is the second momentum coefficient (both usually set to standard default values), $\eta$ is the learning rate, $\lambda$ is the weight decay coefficient, $\epsilon$ is the numerical stability constant, $m_t$ and $m_{t-1}$ are the first-order momentum at rounds $t$ and $t-1$, $v_t$ and $v_{t-1}$ are the second-order momentum at rounds $t$ and $t-1$, $\hat{m}_t$ and $\hat{v}_t$ are the bias-corrected first-order and second-order momentum, and $\theta_t$ and $\theta_{t-1}$ are the model parameters at rounds $t$ and $t-1$.
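A single AdamW update for one scalar parameter may be sketched as follows. The default values (0.9, 0.999, 1e-8, and a weight decay of 1e-2) are commonly used defaults adopted here as assumptions; they are not values taken from this embodiment.

```python
import math


def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad         # first-order momentum
    v = beta2 * v + (1 - beta2) * grad * grad  # second-order momentum
    m_hat = m / (1 - beta1 ** t)               # bias correction
    v_hat = v / (1 - beta2 ** t)
    # Decoupled weight decay: the decay term is not passed through the adaptive scaling.
    theta = theta - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * theta)
    return theta, m, v


# First step on f(theta) = theta**2 from theta = 1, where the gradient is 2 * theta = 2.
theta, m, v = adamw_step(theta=1.0, grad=2.0, m=0.0, v=0.0, t=1)
print(theta)
```

On the first step the bias-corrected ratio is close to 1 regardless of the gradient scale, so the parameter moves by roughly the learning rate plus the small decay term.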
[0061] During training of the hybrid neural network model, a cosine annealing learning rate scheduling strategy is adopted so that the learning rate decays smoothly: $\eta_t = \eta_{\min} + \frac{1}{2} (\eta_0 - \eta_{\min}) \left( 1 + \cos \frac{\pi t}{T} \right)$; where $\eta_0$ is the initial learning rate, $\eta_{\min}$ is the minimum learning rate, $t$ is the current training round, $T$ is the total number of training rounds, and $\eta_t$ is the learning rate of the $t$-th training round.
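The cosine annealing schedule can be evaluated directly, as in the sketch below. The total of 800 rounds follows the embodiment described later; the initial learning rate of 1e-3 and the minimum of 0 are placeholder values chosen only for illustration.

```python
import math


def cosine_lr(epoch, total_epochs, lr_init, lr_min=0.0):
    """Cosine-annealed learning rate for a given training round."""
    return lr_min + 0.5 * (lr_init - lr_min) * (1 + math.cos(math.pi * epoch / total_epochs))


# Placeholder initial learning rate; the schedule runs from round 0 to round 800.
schedule = [cosine_lr(e, 800, 1e-3) for e in range(801)]
print(schedule[0], schedule[400], schedule[800])
```

The schedule starts at the initial learning rate, passes through the midpoint between the initial and minimum rates halfway through training, and ends at the minimum rate.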
[0062] The complete training process for the hybrid neural network model is as follows: (a) Initialize the model parameters and set the hyperparameters (batch size $B$, learning rate $\eta$, number of training rounds $T$, weight decay coefficient $\lambda$, etc.); (b) Iterate over the training rounds $t = 1, \dots, T$: b1. Randomly shuffle the training set and divide it into batches of size $B$; b2. For each training batch $(X_j, y_j)$: forward propagation: $\hat{y}_j = f_\theta(X_j)$; loss calculation: $\mathcal{L}_j = \mathcal{L}_{\mathrm{CE}}(\hat{y}_j, y_j)$; backpropagation: compute the gradient of the loss with respect to the model parameters, $\nabla_\theta \mathcal{L}_j$; parameter update: update the model parameters according to the AdamW update rules; where $X_j$ is the input data of the $j$-th batch, $y_j$ are the true labels of the $j$-th batch, $\hat{y}_j$ are the logits predicted by the model for the $j$-th batch, $\mathcal{L}_j$ is the loss value of the $j$-th batch, and $\nabla_\theta \mathcal{L}_j$ is the gradient of the loss with respect to the parameters.
[0063] b3. Update learning rate: Adjust the current learning rate according to the cosine annealing strategy; b4. Evaluate the model performance on the validation set and record the validation accuracy; b5. If the current validation accuracy is better than the historical best, save the model checkpoint; (c) After training is completed, load the best performing model checkpoint on the validation set as the final model, i.e., the trained hybrid neural network model.
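The control flow of steps (a) to (c) can be sketched independently of any particular model. In the skeleton below, `step_fn` and `val_fn` are hypothetical callables standing in for the per-batch update (forward pass, loss, backpropagation, AdamW step, with the round index available for learning-rate scheduling) and for validation; the toy usage replaces the network with a single number.

```python
import copy
import random


def fit(params, batches, step_fn, val_fn, epochs, seed=42):
    """Generic training loop: shuffle, update per batch, validate, keep the best checkpoint."""
    rng = random.Random(seed)
    best_acc, best_params = float("-inf"), copy.deepcopy(params)
    for epoch in range(epochs):
        order = list(batches)
        rng.shuffle(order)                          # b1: reshuffle every round
        for batch in order:                         # b2: forward, loss, backward, update
            params = step_fn(params, batch, epoch)  # b3: epoch lets step_fn schedule the rate
        acc = val_fn(params)                        # b4: validation accuracy
        if acc > best_acc:                          # b5: checkpoint on improvement
            best_acc, best_params = acc, copy.deepcopy(params)
    return best_params, best_acc                    # (c): best checkpoint, not the last state


# Toy stand-ins: the "model" is one number and the validation score peaks at 3.
best, acc = fit(0, [1], lambda p, b, e: p + b, lambda p: 1.0 - abs(p - 3) / 10, epochs=5)
print(best, acc)
```

In the toy run the parameter keeps growing past the best value, yet the function returns the state recorded at the best validation score, which is the behavior described in step (c).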
[0064] Specifically, the radar signal modulation includes multiple preset categories, and outputting the radar signal modulation recognition result includes: Step B1: Process the class prediction logits output by the classification head with the Softmax function to obtain the posterior probability distribution over the multiple preset categories.
[0065] Step B2: Select the category corresponding to the maximum probability from the posterior probability distribution as the numerical category index.
[0066] The numerical category index $\hat{c}$ is expressed as: $\hat{c} = \arg\max_{k} p_k$, with $p_k = \frac{\exp(z_k)}{\sum_{j=1}^{K} \exp(z_j)}$; where $p_k$ is the predicted probability of the $k$-th class, $z_k$ is the predicted logit of the $k$-th class, and $z_j$ is the predicted logit of the $j$-th class for the current sample.
[0067] The confidence score of this category, $p_{\hat{c}} = \max_k p_k$, is also output and is used to assess the reliability of the recognition result.
[0068] Step B3: Based on the preset category label mapping table, convert the numerical category index into the corresponding radar signal modulation semantic label.
[0069] Step B4: Use the maximum probability as the confidence score, and output the confidence score and the radar signal modulation semantic label.
[0070] By outputting the confidence score and the semantic label of the radar signal modulation, intelligent recognition of radar signal modulation can be achieved.
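Steps B1 to B4 can be sketched for a single sample as follows. The three-entry label mapping table and the example logits are hypothetical and serve only to show the lookup; an actual table would cover all preset categories.

```python
import math

# Hypothetical category label mapping table (index -> semantic label).
LABEL_MAP = {0: "BPSK", 1: "QPSK", 2: "8PSK"}


def decode_prediction(logits, label_map):
    """Softmax -> argmax -> label lookup; returns (label, confidence, index)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]              # B1: stable Softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    index = max(range(len(probs)), key=probs.__getitem__)    # B2: most probable class
    return label_map[index], probs[index], index             # B3 and B4


label, confidence, index = decode_prediction([0.1, 2.0, -1.0], LABEL_MAP)
print(label, round(confidence, 3))
```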
[0071] To make the technical solution, implementation method, and beneficial effects of the present invention clearer, the present invention will be further described in detail below with reference to specific embodiments. This embodiment uses the RML2016.10a radio modulation signal dataset as the experimental object. This dataset contains IQ signal samples with various modulation modes, which can effectively simulate the signal characteristic differences under different radar signal modulations, and is used to verify the effectiveness of the method of the present invention.
[0072] The experimental environment configuration for the method of this invention is as follows: The hardware environment is as follows: the central processing unit is an Intel(R) Xeon(R) CPU E5-2667v3 @ 3.20GHz processor; the graphics processor is dual NVIDIA RTX8000 graphics cards (each with 48GB VRAM); and the memory is 64GB of Random Access Memory (RAM).
[0073] The software environment is as follows: the operating system is Ubuntu 20.04 LTS; the Python version is 3.10; the deep learning framework is PyTorch 2.7, which supports CUDA acceleration; and the dependent libraries are NumPy, Pandas, Scikit-learn, etc.
[0074] This embodiment uses the RML2016.10a dataset for experimental verification. This dataset is a widely used benchmark in the field of radio machine learning, with the following characteristics. Basic information: there are 11 modulation categories, comprising 8 digital modulation modes (BPSK, QPSK, 8PSK, 16QAM, 64QAM, GFSK, CPFSK, PAM4) and 3 analog modulation modes (WB-FM, AM-SSB, AM-DSB). The signal-to-noise ratio (SNR) ranges from -20 dB to +18 dB in steps of 2 dB, giving 20 SNR levels. Each modulation mode contains 1000 samples per SNR level. Each sample is an IQ dual-channel signal with 128 time sampling points, so the sample dimension is $2 \times 128$.
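The dataset figures stated above imply the total sample count, which the following arithmetic sketch makes explicit. All numbers come from the description; the variable names are illustrative.

```python
snr_levels = list(range(-20, 19, 2))  # -20 dB to +18 dB in 2 dB steps
num_modulations = 11                  # 8 digital + 3 analog modulation modes
samples_per_pair = 1000               # samples per (modulation, SNR) pair
sample_shape = (2, 128)               # I and Q channels x 128 time samples

total_samples = num_modulations * len(snr_levels) * samples_per_pair
print(len(snr_levels), total_samples, sample_shape)
```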
[0075] Application of the dataset for radar signal modulation identification verification: The RML2016.10a dataset was originally designed for radio modulation identification. It contains multiple modulation modes, each corresponding to different signal waveform structures, spectral characteristics, and statistical features, which are highly similar to radar signal modulation. Therefore, this embodiment uses the modulation categories in the RML2016.10a dataset to simulate radar signal modulation categories to verify the ability of the method of this invention to identify radar signal modulation. According to the technical solution of this invention, the PatchTST–ModernTCN model is constructed, and the specific parameter configuration is as follows: Table 1 Data Preprocessing Parameters
[0076] Table 2 PatchTST Parameters
[0077] Table 3. Parameters of the V-Mixer (Transvariable Mixer)
[0078] Table 4 Classification Header Parameters
[0079] Table 5 Training Parameters
[0080] Table 1 lists the data preprocessing parameters, including input sequence length, number of channels, block length, stride, number of blocks, and batch size. Table 2 lists the PatchTST parameters, including the number of Transformer layers, number of attention heads, model dimensions, feedforward network dimensions, Dropout ratio, embedding mode, activation function, and residual attention. Table 3 lists the cross-variable mixing block parameters, including the number of V-Mixing blocks, large kernel convolution size, small kernel convolution size, FFN expansion ratio, V-Mixer Dropout, and alternating insertion frequency. Table 4 lists the classification head parameters, including the number of output classes, head dropout, and pooling method. Table 5 lists the training parameters, including the number of training epochs, initial learning rate, and random seed.
[0081] As an optional embodiment of the method of the present invention, the specific operations include: Step 1: Data loading and preprocessing.
[0082] Load the RML2016.10a dataset from the specified path. The data format is a structured file containing IQ signals and corresponding modulation mode labels.
[0083] Set the random seed to 42 to ensure the reproducibility of data partitioning and model initialization.
[0084] A 5-fold cross-validation strategy is adopted. Fold 0 (of $K = 5$ folds) is selected as the data partitioning scheme for the current experiment, and the dataset is divided into a training set, a validation set, and a test set, where $K$ is the number of parts into which the dataset is divided during cross-validation.
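One possible way to realize such a partition is sketched below. The text does not state how the validation set is obtained; this sketch assumes that the selected fold serves as the test set and the next fold as the validation set, which is an assumption rather than the embodiment's procedure. The function name is hypothetical.

```python
import numpy as np


def kfold_split(num_samples, num_folds=5, fold=0, seed=42):
    """Shuffle indices reproducibly and return (train, val, test) index arrays."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(num_samples), num_folds)
    val_fold = (fold + 1) % num_folds  # assumption: the next fold is used for validation
    test_idx = folds[fold]
    val_idx = folds[val_fold]
    train_idx = np.concatenate(
        [f for i, f in enumerate(folds) if i not in (fold, val_fold)]
    )
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = kfold_split(220000)
print(len(train_idx), len(val_idx), len(test_idx))
```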
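[0085] According to the block parameters (block length $P = 16$ and the sliding step $S$ listed in Table 1), the time series is patched, converting the original input $X \in \mathbb{R}^{B \times C \times L}$ into the block tensor $X_p \in \mathbb{R}^{B \times C \times N \times P}$.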
[0086] Step 2: Model building.
[0087] Based on the above parameter configuration, instantiate the PatchTST–ModernTCN model. The model structure includes: Patch embedding layer: Maps a 16-dimensional patch to a 256-dimensional feature space.
[0088] Position coding layer: Employs zero-initialized learnable position coding.
[0089] PatchTST: A 4-layer Transformer encoder, each layer containing an 8-head self-attention sub-layer and a feedforward network.
[0090] Modern temporal convolutional network intervariate mixing module: 1 intervariate mixing block, using end-cascade mode.
[0091] Classification head: global average pooling, Dropout, and a fully connected layer.
[0092] Step 3: Loss function and optimizer configuration.
[0093] Construct a cross-entropy loss function with class weights, $\mathcal{L}_{w}$: $\mathcal{L}_{w} = -\sum_{k=1}^{K} w_k \, y_k \log p_k$; where $w_k$ is the category weight calculated in step 2, $y_k$ is the one-hot encoding of the true label, and $p_k$ is the class probability predicted by the model.
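A class-weighted cross-entropy can be sketched as follows. The example weights are arbitrary, and the sketch averages the weighted per-sample losses over the batch; some library implementations instead divide by the sum of the selected weights, so the normalization here is an assumption.

```python
import numpy as np


def weighted_cross_entropy(logits, labels, class_weights):
    """Batch mean of w[y_i] * (-log p[i, y_i]) using a stable log-Softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(labels)), labels]
    return float((class_weights[labels] * nll).mean())


logits = np.zeros((2, 3))            # uniform predictions over 3 classes
labels = np.array([0, 2])
weights = np.array([1.0, 2.0, 3.0])  # arbitrary example class weights
loss = weighted_cross_entropy(logits, labels, weights)
print(loss)
```

With uniform predictions each sample contributes its class weight times $\log 3$, so the example evaluates to $2 \log 3$.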
[0094] Configure the AdamW optimizer with the initial learning rate listed in Table 5.
[0095] Configure a cosine annealing learning rate scheduler so that the learning rate decays smoothly from its initial value over 800 epochs.
[0096] Step 4: Model training.
[0097] Execute the training loop for a total of 800 epochs: in each epoch, iterate through all training batches, performing forward propagation, loss calculation, backpropagation, and parameter update; after each epoch, evaluate the model on the validation set; record metrics such as training loss, validation loss, training accuracy, and validation accuracy; and save a model checkpoint whenever the validation accuracy reaches a new best.
[0098] Step 5: Model testing and result evaluation.
[0099] Load the best-performing model checkpoints on the validation set.
[0100] Perform model inference on the test set to obtain the class prediction logits for each sample.
[0101] Obtain the predicted class label by taking the index of the largest class prediction logit (argmax).
[0102] Calculate evaluation indicators: Accuracy is the proportion of correctly classified samples out of the total number of samples.
[0103] The macro-average F1 score is the arithmetic mean of the F1 scores for each category, which is used to comprehensively evaluate precision and recall.
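The two evaluation indicators can be computed as in the following sketch. The label arrays are a small made-up example, and a class with no true or predicted samples is assigned an F1 score of 0 by convention in this sketch.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return float((y_true == y_pred).mean())


def macro_f1(y_true, y_pred, num_classes):
    """Arithmetic mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for k in range(num_classes):
        tp = np.sum((y_pred == k) & (y_true == k))
        fp = np.sum((y_pred == k) & (y_true != k))
        fn = np.sum((y_pred != k) & (y_true == k))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(scores))


y_true = np.array([0, 0, 1, 1, 2, 2])  # made-up labels
y_pred = np.array([0, 1, 1, 1, 2, 0])  # made-up predictions
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred, 3))
```

In this example the per-class F1 scores are 0.5, 0.8, and 2/3, so the macro average differs from the accuracy of 4/6 even though the classes are balanced.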
[0104] Through the above implementation steps, the PatchTST–ModernTCN cascade model proposed in this invention has been fully trained and tested on the RML2016.10a dataset, achieving an accuracy of 61.09% and a macro-average F1 score of 61.88% on the test set. Experimental results show that: Regarding recognition accuracy: The method of this invention achieved a high overall recognition accuracy on the test set across the entire SNR range (-20dB to +18dB), and performed particularly well under medium to high signal-to-noise ratio conditions (SNR≥0dB), verifying the model's effective ability to distinguish multi-mode signals.
[0105] Cross-channel modeling effect: By introducing the V-Mixing module to explicitly model the correlation between the I and Q signals, compared with the original PatchTST channel-independent strategy, the model can make fuller use of the joint feature information of the I and Q signals, and improve the ability to distinguish modulation modes with similar timing features but different phase characteristics.
[0106] Temporal Dependency Capture: PatchTST effectively captures the long-range dependencies and periodic patterns of signals in the time dimension through a multi-head self-attention mechanism, which has significant advantages for modulation pattern recognition with complex temporal structures.
[0107] The method of this invention solves the problem of relying on manual feature design in multilayer perceptron-based methods, and realizes end-to-end automatic feature learning; it solves the problem of lack of global vision in temporal segmentation and clustering methods, and realizes effective modeling of long-range temporal dependencies; it solves the contradiction between the channel independence assumption of the PatchTST model and the strong correlation characteristics of radar IQ signals, and realizes joint feature extraction of multi-channel signals.
[0108] This invention employs a sliding-window patching strategy to preprocess the original radar IQ time series. Let the length of the original time series be $L$, the number of channels be $C$, the block length be $P$, and the sliding step size be $S$; the number of blocks $N$ is then determined by $L$, $P$, and $S$ (for an unpadded sliding window, $N = \lfloor (L - P)/S \rfloor + 1$). This block-based strategy converts the original input tensor $X \in \mathbb{R}^{B \times C \times L}$ into the block tensor $X_p \in \mathbb{R}^{B \times C \times N \times P}$, which reduces computational complexity while preserving the local structural information of the time series.
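The sliding-window patching can be sketched with NumPy indexing as follows. The sketch assumes an unpadded window, so the number of blocks is (L - P) // S + 1; the block length of 16 follows the embodiment, while the stride of 8 is a placeholder value.

```python
import numpy as np


def patchify(x, patch_len, stride):
    """(B, C, L) -> (B, C, N, P) with N = (L - P) // S + 1 (no padding)."""
    seq_len = x.shape[-1]
    num_patches = (seq_len - patch_len) // stride + 1
    starts = np.arange(num_patches) * stride
    idx = starts[:, None] + np.arange(patch_len)[None, :]  # (N, P) sample indices
    return x[..., idx]


x = np.arange(4 * 2 * 128).reshape(4, 2, 128)  # batch of 4 IQ samples, 128 points each
patches = patchify(x, patch_len=16, stride=8)  # stride 8 is a placeholder value
print(patches.shape)
```

Because the stride is smaller than the block length in this example, adjacent blocks overlap and the local waveform structure at block boundaries is preserved.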
[0109] This invention constructs a PatchTST encoder based on a multi-head self-attention mechanism to capture long-range temporal dependencies in radar pulse sequences. The encoder comprises the following core components: a patch embedding layer, which linearly projects each patch from the $P$-dimensional sampling space to the $D$-dimensional feature embedding space and supports both shared and independent embedding modes; a position encoding layer, which introduces learnable or fixed position encoding vectors so that the model can perceive the position of each element in the patch sequence; and multiple stacked Transformer encoder layers, each containing a multi-head self-attention sub-layer and a feedforward neural network sub-layer with residual connections and normalization, which together achieve deep feature extraction.
[0110] This invention introduces a cross-variable mixing module that explicitly models the correlation and complementarity between the radar I and Q signals, resolving the contradiction between the channel-independence assumption of PatchTST and the strong correlation of radar IQ signals. The module includes the following core components: a reparameterized large-kernel depthwise convolutional layer, which uses a dual-branch structure with a large kernel and a small kernel for temporal feature extraction, enhancing model expressiveness during training, and whose branches are merged into a single convolution at inference to improve computational efficiency; an intra-variable feature mixing submodule, which uses pointwise convolutions with the number of groups equal to the number of variables $C$ to mix the feature dimensions within each variable; a cross-variable feature mixing submodule, which uses pointwise convolutions with the number of groups equal to the feature dimension $D$ to enable information interaction and fusion between different variables (the I and Q signals) in each feature dimension; and a residual connection unit, which connects the module output to its input through a residual path to ensure gradient flow and training stability.
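The difference between the two grouped pointwise convolutions can be illustrated with a simplified linear sketch. Here each grouped pointwise convolution is written as an `einsum`: one D x D map per variable for intra-variable mixing, and one C x C map per feature for cross-variable mixing. The large-kernel depthwise convolution, activations, and expansion ratio are omitted, and the layout (batch, variables, features, blocks), sizes, and random weights are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
B, C, D, N = 2, 2, 4, 5  # batch, variables (I/Q), feature dim, blocks (example values)
x = rng.standard_normal((B, C, D, N))

W_intra = rng.standard_normal((C, D, D))  # groups = C: one D x D map per variable
W_cross = rng.standard_normal((D, C, C))  # groups = D: one C x C map per feature


def intra_variable_mix(x, w):
    """Mix feature dimensions inside each variable; no exchange between I and Q."""
    return np.einsum("bcdn,cde->bcen", x, w)


def cross_variable_mix(x, w):
    """Mix the variables (I and Q) separately for each feature dimension."""
    return np.einsum("bcdn,dce->bedn", x, w)


# Residual connection around the two mixing steps.
y = x + cross_variable_mix(intra_variable_mix(x, W_intra), W_cross)
print(y.shape)
```

Zeroing one variable leaves its intra-variable output at zero, whereas the cross-variable step lets information from the I channel reach the Q channel and vice versa, which is the interaction the channel-independent encoder alone cannot provide.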
[0111] This invention provides two cascaded integration strategies: End-to-end cascade mode: After PatchTST completes temporal feature extraction, the output is directly sent to the cross-variable mixing stage for channel fusion, and then the recognition result is output through the classification head; or, alternating mixing mode: Inside PatchTST, a cross-variable mixing stage is inserted after every few Transformer encoder layers to achieve alternating temporal modeling and channel mixing, simultaneously capturing the sequence dependency in the temporal dimension and the variable interaction in the channel dimension at different levels of abstraction.
[0112] This invention employs a global average pooling strategy to aggregate the encoder output along the temporal dimension, compressing sequence-level features into a fixed-length feature vector. Specifically, the encoder output $Z_f$ is averaged over the block dimension $N$ to obtain the sequence-level feature representation, which is then flattened, regularized with Dropout, and mapped to the category space by a fully connected layer to classify and recognize the radar signal modulation.
[0113] The PatchTST–ModernTCN architecture in this invention organically combines the global self-attention mechanism of Transformer with the local feature extraction capability of temporal convolutional networks. Through an alternating or cascaded module combination strategy, it simultaneously captures global sequence dependencies in the temporal dimension and variable interactions in the channel dimension at different levels of abstraction, providing an efficient and accurate technical solution for intelligent recognition of radar signal modulation in complex electromagnetic environments.
[0114] As can be seen from the implementation shown in Figure 1 above, the embodiments of the present invention perform block preprocessing on the radar IQ signal to be identified to obtain a block tensor; construct a hybrid neural network model that includes a block transform temporal encoder, a modern temporal convolutional network cross-variable fusion module, and a classification head, where the block transform temporal encoder captures long-range temporal dependencies and the cross-variable fusion module performs cross-variable fusion on the extracted temporal feature tensor; and input the block tensor into the trained hybrid neural network model, which processes it with the block transform temporal encoder, the cross-variable fusion module, and the classification head and outputs the radar signal modulation recognition result. The trained hybrid neural network model is obtained by training the hybrid neural network model on training samples carrying true signal modulation labels. In this way, the model learns hierarchical feature representations directly from the original IQ signals without manual feature engineering, which improves its adaptability and generalization ability. The Transformer-based block transform temporal encoder uses multi-head self-attention to compute correlation weights between different positions across the whole sequence, exploiting the long-span dependencies and periodic patterns in radar pulse sequences and compensating for the limited global modeling capability of local segmentation methods. Because the hybrid neural network model combines the block transform temporal encoder with the modern temporal convolutional network cross-variable fusion module and the classification head, it retains the temporal modeling advantages of PatchTST while explicitly modeling the correlation and complementarity between the I and Q signals, achieving joint feature extraction from multi-channel radar signals.
[0115] Based on the same inventive concept, as an implementation of the above radar signal modulation identification method based on time-series modeling and cross-variable fusion, this embodiment of the invention also provides a radar signal modulation identification device based on time-series modeling and cross-variable fusion. Figure 2 is a structural diagram of this device. As shown in Figure 2, the device may include: a block preprocessing module 201, used to perform block preprocessing on the radar IQ signal to be identified to obtain a block tensor; a building module 202, used to build a hybrid neural network model that includes a block transform temporal encoder, a modern temporal convolutional network cross-variable fusion module, and a classification head, where the block transform temporal encoder is used to capture long-range temporal dependencies and the cross-variable fusion module is used to perform cross-variable fusion on the extracted temporal feature tensor; and a recognition module 203, used to input the block tensor into the trained hybrid neural network model so that the block tensor is processed by the block transform temporal encoder, the cross-variable fusion module, and the classification head, and to output the radar signal modulation recognition result; wherein the trained hybrid neural network model is obtained by training the hybrid neural network model on training samples carrying true signal modulation labels.
[0116] The block preprocessing module 201 is specifically used to divide the radar IQ signal to be identified into multiple continuous blocks by sliding window according to the block length and sliding step size; determine the number of blocks according to the length of the radar IQ signal to be identified, the block length and the sliding step size; and arrange the multiple continuous blocks in chronological order to obtain a block tensor, the dimensions of which are the number of channels, the number of blocks and the block length.
[0117] In the recognition module 203, the radar signal modulation recognition result is output, including: processing the category prediction log odds output by the classification head using the Softmax function to obtain the posterior probability distribution of multiple preset categories; selecting the category corresponding to the maximum probability from the posterior probability distribution as the numerical category index; converting the numerical category index into the corresponding radar signal modulation semantic label according to the preset category label mapping table; using the maximum probability as the confidence score, and outputting the confidence score and the radar signal modulation semantic label. The radar signal modulation includes multiple preset categories.
[0118] It should be noted that the above description of the radar signal modulation identification device embodiment based on time-series modeling and cross-variable fusion is similar to the description of the radar signal modulation identification method embodiment based on time-series modeling and cross-variable fusion, and has similar beneficial effects. For technical details not disclosed in the embodiments of the radar signal modulation identification device based on time-series modeling and cross-variable fusion of the present invention, please refer to the description of the radar signal modulation identification method embodiment based on time-series modeling and cross-variable fusion of the present invention for understanding.
[0119] The above are merely specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in the present invention should be included within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.
Claims
1. A radar signal modulation identification method based on time-series modeling and cross-variable fusion, characterized in that it comprises: performing block preprocessing on the radar IQ signal to be identified to obtain a block tensor; constructing a hybrid neural network model, which includes a block transform temporal encoder, a modern temporal convolutional network intervariate fusion module, and a classification head, wherein the block transform temporal encoder is used to capture long-range temporal dependencies, and the modern temporal convolutional network intervariate fusion module is used to perform intervariate fusion on the extracted temporal feature tensor; and inputting the block tensor into the trained hybrid neural network model to process the block tensor using the block transform temporal encoder, the modern temporal convolutional network intervariate fusion module, and the classification head, and outputting the radar signal modulation recognition result; wherein the trained hybrid neural network model is obtained by training the hybrid neural network model using training samples carrying true signal modulation labels.
2. The method according to claim 1, characterized in that, The process of performing block preprocessing on the radar IQ signal to be identified to obtain block tensors includes: The radar IQ signal to be identified is divided into multiple consecutive blocks by a sliding window based on the block length and sliding step size. The number of blocks is determined based on the length of the radar IQ signal to be identified, the block length, and the sliding step size; The multiple consecutive blocks are arranged in chronological order to obtain the block tensor, and the dimensions of the block tensor are the number of channels, the number of blocks, and the block length.
3. The method according to claim 2, characterized in that, The block-based transform timing encoder includes a block embedding layer, a position encoding layer, and a stacked multi-layer Transformer encoder layer connected in sequence. The block embedding layer is used to linearly project each block from the original sampling space to the feature embedding space to obtain the embedded feature tensor; The location coding layer is used to add the location coding vector to the embedded feature tensor to obtain the encoder input representation carrying location information; The stacked multi-layer Transformer encoder is used to sequentially perform multi-head self-attention calculation and feedforward neural network transformation on the encoder input representation, extracting deep temporal features layer by layer to output the temporal feature tensor, which carries the long-range temporal dependency.
4. The method according to claim 3, characterized in that, The stacked multi-layer Transformer encoder layer includes multiple Transformer encoder layers connected in sequence. Each Transformer encoder layer includes a multi-head self-attention sub-layer, a first residual connection and normalization unit, a feedforward neural network sub-layer, and a second residual connection and normalization unit connected in sequence. The multi-head self-attention sub-layer is used to perform multi-head self-attention calculation on the encoder input representation and output attention-weighted features; The first residual connection and normalization unit is used to perform residual connection and layer normalization on the attention-weighted features and the encoder input representation to obtain the first normalized features; The feedforward neural network sublayer is used to perform feedforward neural network transformation on the first normalized feature to obtain the feedforward transformed feature. The second residual connection and normalization unit is used to perform residual connection and layer normalization on the feedforward transform feature and the first normalization feature to obtain the output coding feature corresponding to the current Transformer encoder layer; Specifically, the output encoded features corresponding to the current Transformer encoder layer are used as the input to the next Transformer encoder layer, processed sequentially through each Transformer encoder layer, and the temporal feature tensor is output by the last Transformer encoder layer.
5. The method according to claim 1, characterized in that, The modern temporal convolutional network intervariate mixing module includes multiple intervariate mixing blocks connected in sequence.
6. The method according to claim 5, characterized in that the radar IQ signal to be identified includes variables from multiple channels, and each intervariate mixing block includes a depthwise temporal convolution submodule, an intra-variable feature mixing submodule, an intervariate feature mixing submodule, and a residual connection unit. The depthwise temporal convolution submodule is used to perform depthwise convolution on the temporal feature tensor through a reparameterized large-kernel depthwise convolutional layer to obtain depthwise temporal convolution features. The intra-variable feature mixing submodule is used to perform grouped pointwise convolution on the depthwise temporal convolution features to mix the feature dimensions within each channel and obtain intra-variable mixed features. The intervariate feature mixing submodule is used to perform dimension transposition and grouped pointwise convolution on the intra-variable mixed features to achieve information interaction and fusion between different variables and obtain intervariate mixed features. The residual connection unit is used to perform a residual connection between the intervariate mixed features and the temporal feature tensor to obtain the cross-variable fusion features output by the multiple intervariate mixing blocks.
7. The method according to claim 3, characterized in that, The hybrid neural network model is a terminal cascaded architecture or an alternating hybrid architecture; The terminal cascaded architecture includes the block transform temporal encoder, the modern temporal convolutional network cross-variable mixing module, and the classification head connected in sequence. The alternating hybrid architecture includes the block transform temporal encoder and at least one of the modern temporal convolutional network cross-variable hybrid modules, wherein the modern temporal convolutional network cross-variable hybrid modules are inserted between the multi-layer Transformer encoder layers in the block transform temporal encoder, and the output of the alternating hybrid architecture is connected to the classification head.
8. The method according to claim 6, characterized in that the classification head includes a global average pooling layer, a flattening layer, a Dropout layer, and a fully connected layer connected in sequence; the global average pooling layer is used to perform global average pooling on the cross-variable fusion features along the time dimension to obtain a sequence-level feature vector; the flattening layer is used to flatten the sequence-level feature vector into a one-dimensional feature vector; the Dropout layer is used to apply random deactivation to the one-dimensional feature vector to obtain a regularized feature vector; and the fully connected layer is used to map the regularized feature vector to the radar signal modulation category space and output category prediction logits, from which the radar signal modulation identification result is determined.
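The four layers of the classification head in claim 8 can be traced with a short pure-Python sketch. It assumes features laid out as `features[variable][feature][time]` and uses inverted dropout, which is a common convention rather than something the claim specifies. The weights, shapes, and names are illustrative only.

```python
import random


def classification_head(features, weights, bias, drop_prob=0.0, rng=None):
    """features[m][d] is a time series; returns one unnormalized score per class."""
    # Global average pooling over the time dimension -> sequence-level features.
    pooled = [[sum(series) / len(series) for series in variable] for variable in features]
    # Flatten (variables x features) into a one-dimensional vector.
    flat = [v for row in pooled for v in row]
    # Inverted dropout: active only when drop_prob > 0 (training); identity otherwise.
    if drop_prob > 0.0:
        rng = rng or random.Random()
        keep = 1.0 - drop_prob
        flat = [v / keep if rng.random() < keep else 0.0 for v in flat]
    # Fully connected layer: score[c] = weights[c] . flat + bias[c].
    return [sum(w * v for w, v in zip(row, flat)) + b for row, b in zip(weights, bias)]


# Toy input: 2 variables x 2 features x 2 time steps, mapped to 3 classes.
features = [[[1, 3], [2, 4]], [[0, 2], [5, 5]]]
weights = [[1, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1]]
bias = [0.0, 0.5, -1.0]
scores = classification_head(features, weights, bias)
```

Here pooling gives the flattened vector `[2.0, 3.0, 1.0, 5.0]`, so the three class scores are `[2.0, 5.5, 10.0]`. Dropout is left at its default of 0, as it would be at inference time.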
9. The method according to claim 8, characterized in that the radar signal modulation includes multiple preset categories, and outputting the radar signal modulation identification result includes: processing the category prediction logits output by the classification head with the Softmax function to obtain a posterior probability distribution over the multiple preset categories; selecting, from the posterior probability distribution, the category with the maximum probability as a numerical category index; converting the numerical category index into the corresponding radar signal modulation semantic label according to a preset category label mapping table; and taking the maximum probability value as a confidence score, and outputting the confidence score together with the radar signal modulation semantic label.
10. A radar signal modulation identification device based on time series modeling and cross-variable fusion, characterized in that it comprises: a block preprocessing module, used to perform block preprocessing on the radar IQ signal to be identified to obtain a block tensor; a construction module, used to build a hybrid neural network model that includes a block transform temporal encoder, a modern temporal convolutional network cross-variable fusion module, and a classification head, wherein the block transform temporal encoder is used to capture long-range temporal dependencies, and the modern temporal convolutional network cross-variable fusion module is used to perform cross-variable fusion on the extracted temporal feature tensor; and an identification module, used to input the block tensor into the trained hybrid neural network model, so that the block tensor is processed by the block transform temporal encoder, the modern temporal convolutional network cross-variable fusion module, and the classification head, and to output the radar signal modulation identification result; wherein the trained hybrid neural network model is obtained by training the hybrid neural network model with training samples carrying ground-truth signal modulation labels.