Automatic modulation recognition method based on multi-frequency Mamba under federated learning
By combining federated learning and multi-frequency Mamba models, the problems of scarce training samples and data security risks are solved, enabling efficient and secure communication signal modulation recognition in complex electromagnetic environments, and improving recognition accuracy and model adaptability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NANJING UNIV OF SCI & TECH
- Filing Date
- 2026-04-22
- Publication Date
- 2026-06-26
AI Technical Summary
Existing modulation recognition technologies face challenges such as scarce training samples, insufficient feature mining of long time-series signals, difficulty in deploying large models in a lightweight manner, and data security risks associated with centralized training. In particular, they struggle to achieve efficient and secure modulation recognition of communication signals in complex electromagnetic environments.
A federated learning architecture is adopted, which combines a multi-frequency Mamba model and a rotation-flip joint enhancement method. By training locally at edge nodes and expanding samples, a multi-frequency Mamba model is constructed for signal denoising and feature extraction. A weighted model averaging algorithm is used to achieve distributed collaborative training, which ensures data privacy and improves recognition accuracy.
It achieves efficient and secure communication signal modulation recognition in complex electromagnetic environments, reduces computational and communication load, improves the model's generalization ability and recognition accuracy, and adapts to the needs of distributed collaborative recognition.
Figure CN122069147B_ABST
Description
Technical Field
[0001] This invention relates to the fields of wireless communication signal processing and artificial intelligence, and in particular to an automatic modulation recognition method based on multi-frequency Mamba under federated learning.
Background Technology
[0002] Modulation is a core foundation of wireless communication systems. By shifting baseband signals onto high-frequency carriers, it solves the antenna size matching problem and, through higher-order modulation and multiplexing techniques, significantly improves spectrum utilization and anti-interference performance. It is widely used in 5G / 6G mobile communications, satellite data links, the Internet of Things, and military tactical radios. Diverse modulation techniques greatly enhance the performance of communication systems. However, communication is a two-way process. To accurately recover the original information at the receiving end (demodulation), the receiver must strictly match the modulation scheme and parameters used by the transmitter. In practical non-cooperative communication (such as signal reconnaissance) or modern adaptive communication scenarios, the receiver often cannot know in advance which modulation scheme the transmitter is currently using, and traditional fixed demodulators fail. Research on signal modulation recognition is therefore essential: it enables receivers to identify the modulation scheme under blind conditions, providing crucial support for electronic reconnaissance, deciphering adversary intelligence, dynamic channel access, and quality-of-service assurance.
[0003] Early modulation recognition relied primarily on manual analysis by specialists, who examined image features such as time-frequency diagrams and constellation diagrams and drew on experience to determine the modulation scheme. This approach is highly subjective, inefficient, and ill-suited to complex signal environments. In recent years, deep learning has significantly advanced modulation recognition. With their powerful feature learning capabilities, such methods automatically extract high-dimensional features from raw signal data and recognize signals well, especially under low signal-to-noise ratio (SNR) conditions. In real-world complex electromagnetic environments, however, they still face several key bottlenecks. First, training samples are scarce. Existing models rely heavily on massive amounts of high-quality labeled data, but in non-cooperative communication or dynamically changing battlefield environments, acquiring and labeling large numbers of signal samples is extremely costly and difficult, so models overfit under small-sample conditions and generalize markedly worse. Second, the limited ability to represent the spatiotemporal features of long sequences caps the achievable recognition accuracy. Communication signals are essentially long time series with rich amplitude and phase variations. Existing network architectures often struggle to extract local features and capture global dependencies at the same time, and easily lose crucial spatiotemporal correlation information when processing long sequences, which reduces the discriminability of complex modulation patterns. Third, increasingly complex and massive deep models consume enormous computational resources and converge slowly. This lengthens the model iteration and optimization cycle and makes it difficult to meet the stringent requirements for rapid algorithm deployment and real-time updates in scenarios such as cognitive radio. Finally, data security and privacy protection are lacking. Traditional centralized training requires raw data from every node to be aggregated on a central server. In scenarios involving military secrets or core commercial data, this readily creates data leakage risks, makes data silos hard to break down, and incurs high communication costs, severely hindering the development of distributed collaborative recognition.
Summary of the Invention
[0004] The purpose of this invention is to provide an automatic modulation recognition method based on multi-frequency Mamba under federated learning. This method can effectively solve the problems of scarce training samples, insufficient mining of long-term signal features, difficulty in lightweight deployment of large models, and data security risks associated with centralized training in existing modulation recognition technologies. It expands training data by relying on sample augmentation techniques, accurately captures multi-frequency and temporal features of signals using a multi-frequency Mamba model, and achieves distributed collaborative training by combining federated learning. The original data is retained throughout the process and not transmitted externally. While ensuring data security, it improves the accuracy and generalization ability of communication signal modulation recognition in complex electromagnetic environments, meeting the practical needs of blind recognition in various civilian wireless communication scenarios.
[0005] To achieve the above objectives, this invention provides an automatic modulation identification method based on multi-frequency Mamba under federated learning, comprising the following steps:
[0006] S1. Build a federated training system consisting of a central server and M edge monitoring nodes. The raw modulation signal data of each edge monitoring node is stored and processed locally and is not transmitted over the network.
[0007] S2. Each edge monitoring node acquires time-domain radio frequency signals and converts them into I / Q signal sequences of the original modulation signals. A rotation-flip joint enhancement method is used to perform multidimensional geometric transformation on the original modulation signals to expand the local sample set.
[0008] S3. Construct a multi-frequency Mamba model and deploy it on edge nodes. Learn frequency domain weights to complete signal denoising, extract the frequency domain representation of the signal, and extract the spatiotemporal features of long sequence signals.
[0009] S4. The central server initializes and broadcasts global model weights based on the multi-frequency Mamba model. Each edge monitoring node trains its local multi-frequency Mamba model using an expanded sample set and then uploads the weights. The central server aggregates and updates the weights using a weighted model averaging algorithm and feeds back the updated global weights to the edge monitoring nodes.
[0010] S5. Repeat step S4 until the recognition accuracy of the global model stabilizes. Each edge monitoring node then uses the synchronized, converged global model: it converts the time-domain radio frequency signal collected in real time into an I / Q signal sequence and performs automatic modulation recognition on it.
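The stopping rule in S5 ("until the recognition accuracy stabilizes") is not quantified in the text. A minimal sketch of one possible criterion follows; the window length, the tolerance, and the `run_round` callable are hypothetical choices made for illustration, not values taken from the invention.

```python
def accuracy_has_stabilized(history, window=5, tol=1e-3):
    """Return True once the last `window` accuracies differ by at most `tol`.

    `window` and `tol` are assumed hyperparameters, not specified by the source.
    """
    if len(history) < window:
        return False
    recent = history[-window:]
    return max(recent) - min(recent) <= tol


def train_until_stable(run_round, max_rounds=200, window=5, tol=1e-3):
    """Repeat federated rounds (step S4) until accuracy plateaus.

    `run_round` is a hypothetical callable that executes one broadcast,
    local-training and aggregation round and returns the global accuracy.
    """
    history = []
    for _ in range(max_rounds):
        history.append(run_round())
        if accuracy_has_stabilized(history, window, tol):
            break
    return history
```

A range-based test over a sliding window is only one option; a patience counter on the best validation accuracy would serve the same purpose.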
[0011] Preferably, S1 specifically includes:
[0012] S11. Establish a federated training system consisting of a central server as the global aggregator and M edge monitoring nodes as local trainers;
[0013] S12. The central server initializes the global weights $w^{0}$ of the multi-frequency Mamba model;
[0014] S13. Before training begins, the central server broadcasts the initial weights $w^{0}$, or after each iteration the updated global weights $w^{t}$, to all selected active edge monitoring nodes to ensure a consistent initial state.
[0015] Preferably, the rotation-flip joint enhancement method in S2 specifically includes:
[0016] S21. Let the original signal be $(I, Q)$ and the rotated, expanded signal sample be $(I', Q')$. The transformation formula is:
[0017] $I' = I\cos\theta - Q\sin\theta$;
[0018] $Q' = I\sin\theta + Q\cos\theta$;
[0019] where $\theta$ is the rotation angle;
[0020] The signal is rotated counterclockwise by each selected angle $\theta$;
[0021] S22. Apply a flip transformation to the modulated signal $(I, Q)$, performing horizontal and vertical flips. The signal after the flip transformation is $(I_f, Q_f)$, calculated as:
[0022] $(I_f, Q_f) = (-I, Q)$ for the horizontal flip and $(I_f, Q_f) = (I, -Q)$ for the vertical flip;
[0023] where $I_f$ is the in-phase component after the flip transformation and $Q_f$ is the quadrature component after the flip transformation;
[0024] S23. The local sample set is expanded by combining the rotation and flip transformations.
[0025] Preferably, the multi-frequency Mamba model constructed in S3 adopts a federated learning-multi-frequency Mamba algorithm architecture to realize signal denoising, signal frequency domain representation extraction and long sequence signal spatiotemporal feature extraction, including a built-in spectrum correction module, multi-frequency attention module and Mamba-2 backbone network;
[0026] The spectrum correction module sequentially performs fast Fourier transform, full-band soft mask correction, and inverse fast Fourier transform on the input time-domain I / Q sequence, learns frequency domain weights to complete signal denoising, and outputs the denoised time-domain I / Q sequence and the corresponding signal frequency domain representation.
[0027] The multi-frequency attention module receives the frequency domain representation of the signal output from the spectrum correction module, slices the frequency domain representation of the signal along the channel dimension, extracts the frequency domain features of each channel using discrete cosine transform, and generates channel attention weight vectors through a bottleneck structure MLP network to enhance key frequency band features and suppress noise interference.
[0028] The Mamba-2 backbone network receives sequences enhanced by features from a multi-frequency attention module. Based on a structured state-space dual mechanism, it employs block decomposition technology to achieve parallel training. It completes the spatiotemporal feature extraction of long sequence signals in a recursive mode with constant memory usage, achieving an inference complexity of O(L), where L is the length of the time-domain I / Q sequence.
[0029] Preferably, the implementation logic of the spectrum correction module in S3 is as follows:
[0030] S31. Use the fast Fourier transform to map the time-domain I / Q sequence of length $L$ to a frequency-domain complex-valued sequence:
[0031] $X[k] = \sum_{n=0}^{L-1} \left( I[n] + jQ[n] \right) e^{-j 2\pi k n / L}$;
[0032] where $X[k]$ are the frequency-domain complex coefficients, $L$ is the length of the time-domain I / Q sequence, $j$ is the imaginary unit, $k$ is the frequency index, $n$ is the time index, $I[n]$ are the sampled values of the in-phase component of the time-domain I / Q sequence, and $Q[n]$ are the sampled values of the quadrature component of the time-domain I / Q sequence;
[0033] S32. Decouple the frequency-domain complex coefficients $X[k]$ into the real part $X_r$ and the imaginary part $X_i$, and feed them into two parallel two-layer perceptron networks respectively;
[0034] S33. Use the Tanh activation function to learn the full-band soft mask correction coefficients, and correct the real and imaginary parts separately. The correction formula is:
[0035] $\tilde{X}_r = X_r \odot \tanh\left( f_r(X_r) \right), \quad \tilde{X}_i = X_i \odot \tanh\left( f_i(X_i) \right)$;
[0036] where $f_r$ and $f_i$ are the mapping functions of the two two-layer perceptrons, $\tilde{X}_r$ is the corrected real part, $\tilde{X}_i$ is the corrected imaginary part, $\odot$ is element-wise multiplication, and $\tanh$ is the core activation function of the spectrum correction module, used to implement adaptive frequency-domain soft mask correction;
[0037] S34. Reconstruct the corrected real part $\tilde{X}_r$ and the corrected imaginary part $\tilde{X}_i$ into the frequency-domain complex coefficients $\tilde{X}[k]$:
[0038] $\tilde{X}[k] = \tilde{X}_r[k] + j\tilde{X}_i[k]$;
[0039] The time-domain signal is then restored by the inverse fast Fourier transform:
[0040] $\tilde{x}[n] = \frac{1}{L} \sum_{k=0}^{L-1} \tilde{X}[k]\, e^{j 2\pi k n / L}$;
[0041] where $\tilde{x}[n]$ are the sampled values of the denoised time-domain I / Q sequence.
[0042] Preferably, the implementation logic of the multi-frequency attention module in S3 is as follows:
[0043] S35. Slice the feature map, which is composed of the frequency-domain representation of the signal output by the spectrum correction module, along the channel dimension to form a set of feature sequences $\{x_1, x_2, \dots, x_C\}$, where $x_c$ represents the time series of the $c$-th channel and $C$ is the total number of channels;
[0044] S36. Using the discrete cosine transform instead of global average pooling, extract a single-channel frequency-domain representation $F_c[k]$ for each channel sequence:
[0045] $F_c[k] = \sum_{n=0}^{N-1} x_c[n] \cos\left( \frac{\pi k (2n+1)}{2N} \right)$;
[0046] where $F_c[k]$ is the one-dimensional discrete cosine transform of the time series of the $c$-th channel, $N$ is the total number of frequency-domain points, and $x_c[n]$ is the sampled value of the time series of the $c$-th channel at the $n$-th time point;
[0047] The first $K$ coefficients, which contain low-frequency and mid-to-high-frequency components, are selected to form the compressed frequency-domain descriptor, where $K$ is a positive integer;
[0048] S37. Concatenate the frequency-domain representations of all channels into $F$, input them into the bottleneck-structure MLP network, and generate the channel attention weight vector $a$ using the Sigmoid function:
[0049] $a = \sigma\left( W_2\, \delta\left( W_1 F \right) \right)$;
[0050] where $W_1$ and $W_2$ are the weight matrices of the dimensionality-reduction and dimensionality-expansion layers, respectively, $\delta$ is the ReLU function, and $\sigma$ is the Sigmoid function.
[0051] Preferably, the implementation logic of the Mamba-2 backbone network in S3 is as follows:
[0052] S38. A structured state-space dual mechanism is adopted. The input is the frequency-domain feature sequence weighted by the channel attention weight vector output by the multi-frequency attention module. During the training phase, block decomposition is used to convert the recurrent state equation into parallel matrix multiplications.
[0053] S39. In the inference stage, a recurrent mode with constant memory usage is used to extract the spatiotemporal features of long sequence signals. Its inference complexity is O(L), and the core state update and output equations are:
[0054] $h_t = A h_{t-1} + B x_t$;
[0055] $y_t = C h_t$;
[0056] where $h_t$ is the hidden state vector at time step $t$, representing the spatiotemporal characteristics of the signal at time $t$; $h_{t-1}$ is the hidden state vector at time step $t-1$; $A$ is the diagonal state transition matrix defined by the structured state-space dual mechanism; $B$ is the input projection matrix; $C$ is the output projection matrix; $x_t$ is the input feature vector at time step $t$, i.e., the concatenated frequency-domain descriptor sequence weighted by the channel attention weight vector $a$ from S37; and $y_t$ is the output feature vector at time step $t$, i.e., the extracted spatiotemporal features of the long sequence signal.
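The link between the recurrent form used at inference and the parallel matrix form used in training (S38) can be made explicit with a short derivation. The sketch below assumes the notation $h_t = A h_{t-1} + B x_t$, $y_t = C h_t$, a zero initial state $h_0 = 0$, and time-invariant $A$, $B$, $C$. These are simplifying assumptions for illustration; the Mamba-2 backbone may use input-dependent parameters.

```latex
\begin{aligned}
h_t &= A h_{t-1} + B x_t = \sum_{s=1}^{t} A^{t-s} B x_s, \\
y_t &= C h_t = \sum_{s=1}^{t} C A^{t-s} B x_s, \\
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_L \end{bmatrix}
&=
\begin{bmatrix}
CB & 0 & \cdots & 0 \\
CAB & CB & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
CA^{L-1}B & CA^{L-2}B & \cdots & CB
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_L \end{bmatrix}.
\end{aligned}
```

Under these assumptions the whole output sequence is a single lower-triangular matrix product, which can be evaluated block by block in parallel during training. The recurrence on the first line only needs the current state $h_t$ at inference time, which gives constant memory and O(L) total work.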
[0057] Preferably, S4 specifically includes:
[0058] S41. After the central server initializes and broadcasts the global model weights, each edge monitoring node trains a multi-frequency Mamba model locally using the expanded sample set, and the model parameters are updated through mini-batch stochastic gradient descent.
[0059] For the $b$-th training batch of the $m$-th edge monitoring node in the $t$-th round, the parameter update formula is:
[0060] $w_m^{t,b+1} = w_m^{t,b} - \eta \nabla_{w} \mathcal{L}_m^{t,b}\left( w_m^{t,b} \right)$;
[0061] where $\eta$ is the learning rate, $w_m^{t,b}$ denotes the parameters of the multi-frequency Mamba model computed from the data of the $m$-th edge monitoring node, $\mathcal{L}_m^{t,b}$ is the loss function of the multi-frequency Mamba model for the $b$-th training batch of the $m$-th edge monitoring node in the $t$-th round, and $\nabla_{w} \mathcal{L}_m^{t,b}$ is the gradient of the loss function with respect to the model parameters $w$;
[0062] S42. The central server aggregates the client parameters using a weighted average algorithm. The aggregation is expressed as:
[0063] $w^{t+1} = \sum_{m=1}^{M} \frac{n_m}{n} w_m^{t+1}$;
[0064] where $w^{t+1}$ is the new global model weights obtained after aggregation by the central server, $M$ is the total number of edge monitoring nodes participating in the training, $n_m$ is the number of samples in the local augmented sample set of the $m$-th edge monitoring node, $n = \sum_{m=1}^{M} n_m$ is the total number of samples in the local augmented sample sets of all edge monitoring nodes, and $w_m^{t+1}$ is the parameters of the multi-frequency Mamba model uploaded by the $m$-th edge monitoring node after the $t$-th training round is completed.
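The local update in S41 and the sample-weighted aggregation in S42 can be sketched as follows. This is a minimal illustration that treats the model as a flat list of floats; the function names `sgd_step` and `fedavg` are hypothetical, and a real deployment would apply the same arithmetic to each tensor of the multi-frequency Mamba model.

```python
def sgd_step(weights, grads, lr):
    """One mini-batch SGD update: w <- w - lr * grad (step S41)."""
    return [w - lr * g for w, g in zip(weights, grads)]


def fedavg(client_weights, client_sizes):
    """Sample-weighted average of client parameters (step S42).

    client_weights: one flat parameter list per edge node.
    client_sizes:   number of augmented local samples per edge node.
    """
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("at least one client must hold samples")
    aggregated = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        share = size / total  # contribution of this node to the global model
        for idx, value in enumerate(weights):
            aggregated[idx] += share * value
    return aggregated
```

For example, two nodes holding 1 and 3 samples with parameters `[1.0, 2.0]` and `[3.0, 4.0]` aggregate to `[2.5, 3.5]`, so the node with more data pulls the global model toward its own parameters.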
[0065] Preferably, S5 specifically includes:
[0066] The federated training process in step S4 is repeated until the modulation recognition accuracy of the global multi-frequency Mamba model stabilizes. Using the parameters of the global multi-frequency Mamba model after all edge monitoring nodes have converged synchronously, each node converts the time-domain RF signal acquired in real time into an I / Q signal sequence of the original modulation signal and then performs automatic modulation recognition on that sequence to classify and predict the modulation type of the RF signal.
Therefore, the automatic modulation recognition method based on multi-frequency Mamba under federated learning described above has the following beneficial effects:
[0067] (1) This invention combines federated learning architecture with multi-frequency Mamba model design. The original modulation signal data of each edge monitoring node is stored and processed locally, without the need to transmit the original data through the network. While realizing collaborative training of the model, it avoids the risk of data leakage from the data source and ensures the privacy and transmission security of wireless communication signal data.
[0068] (2) The multi-frequency Mamba model constructed in this invention integrates the spectrum correction module, the multi-frequency attention module and the Mamba-2 backbone network. It achieves signal denoising through frequency domain weight learning, accurately extracts the frequency domain features of the signal by combining the multi-frequency attention mechanism, and completes the spatiotemporal feature extraction of long sequence signals by relying on the structured state space dual mechanism of Mamba-2. The model structure is adapted to the feature extraction requirements of radio frequency signals, and improves the feature mining targeting of modulation recognition.
[0069] (3) The present invention uses a rotation-flip joint enhancement method to perform multidimensional geometric transformation on the original modulation signal to expand the local sample set. Combined with the federated training strategy of weighted model averaging, it effectively alleviates the model training bias caused by non-independent and identically distributed data in distributed scenarios. At the same time, the federated training only transmits model weights instead of the original data, which greatly reduces the network communication transmission overhead. The model inference stage has a constant level of memory usage, which also reduces the computational load of edge nodes and improves the engineering practicality of the method.
[0070] The technical solution of the present invention will be further described in detail below with reference to the accompanying drawings and embodiments.
Description of the Drawings
[0071] Figure 1 is a flowchart of an automatic modulation recognition method based on multi-frequency Mamba under federated learning according to the present invention;
[0072] Figure 2 is a diagram of the FL-MFMamba algorithm architecture according to an embodiment of the present invention;
[0073] Figure 3 is a comparison chart of the classification performance of different frameworks under different SNRs when the training dataset is unevenly divided, according to an embodiment of the present invention;
[0074] Figure 4 is a graph showing the convergence performance of each model on the test set under a non-independent and identically distributed data distribution, according to an embodiment of the present invention;
[0075] Figure 5 shows the classification accuracy of FL-MFMamba on unevenly partitioned data under three data augmentation methods, according to an embodiment of the present invention.
Detailed Implementation
[0076] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. The components of the embodiments of the present invention described and shown in the accompanying drawings can generally be arranged and designed in various different configurations. In the description of the present invention, it should be noted that the terms "upper," "lower," "inner," and "outer," etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings, or the orientation or positional relationship commonly used when the product is in use. They are only for the convenience of describing the present invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, they should not be construed as limitations on the present invention.
[0077] Example
[0078] As shown in Figure 1, this invention discloses an automatic modulation recognition method based on multi-frequency Mamba under federated learning, the steps of which include:
[0079] S1. Establish a federated training system consisting of a central server and M edge monitoring nodes. The raw modulation signal data of each edge monitoring node is stored and processed locally and is not transmitted over the network. As shown in ② of Figure 2, S1 specifically includes:
[0080] S11. Establish a federated training system consisting of a central server as the global aggregator and M edge monitoring nodes as local trainers;
[0081] S12. The central server initializes the global weights $w^{0}$ of the multi-frequency Mamba model;
[0082] S13. Before training begins, the central server broadcasts the initial weights $w^{0}$, or after each iteration the updated global weights $w^{t}$, to all selected active edge monitoring nodes to ensure a consistent initial state.
[0083] The core advantage of this system design is that the original radio frequency signal is always stored locally at the monitoring node, and only the encrypted model parameters are transmitted in the network. This physically isolates the risk of data leakage, and the amount of parameters transmitted in a single transmission is much smaller than the original I / Q sequence, which significantly reduces the bandwidth usage of the wireless channel.
[0084] S2. Each edge monitoring node acquires time-domain radio frequency signals and converts them into I / Q signal sequences of the original modulated signals. As shown in ① of Figure 2, the rotation-flip joint enhancement (JAM) method is used to perform a multidimensional geometric transformation on the original modulated signal to expand the local sample set. The rotation-flip joint enhancement method specifically includes:
[0085] S21. Let the original signal be $(I, Q)$ and the rotated, expanded signal sample be $(I', Q')$. The transformation formula is:
[0086] $I' = I\cos\theta - Q\sin\theta$;
[0087] $Q' = I\sin\theta + Q\cos\theta$;
[0088] where $\theta$ is the rotation angle;
[0089] The signal is rotated counterclockwise by each of four angles (simulating channel phase shift). The four angles include ...;
[0090] S22. Apply a flip transformation to the modulated signal $(I, Q)$, performing horizontal and vertical flips (simulating hardware mirroring interference). The signal after the flip transformation is $(I_f, Q_f)$, calculated as:
[0091] $(I_f, Q_f) = (-I, Q)$ for the horizontal flip and $(I_f, Q_f) = (I, -Q)$ for the vertical flip;
[0092] where $I_f$ is the in-phase component after the flip transformation and $Q_f$ is the quadrature component after the flip transformation;
[0093] S23. The local sample set is expanded by combining the rotation and flip transformations. In distributed monitoring scenarios, the data collected by each node is often non-independent and identically distributed. Expanding the local training set to six times its original size through the above geometric transformations significantly enhances the robustness of the model to signal phase fluctuations and mirror-image changes, and effectively mitigates the sample heterogeneity caused by differences in node geographical locations.
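The rotation-flip expansion of S21 to S23 can be sketched in a few lines. The text does not preserve the four rotation angles, so the values 0, π/2, π and 3π/2 used below are an assumption; they are chosen because four rotations (including the identity) plus two flips give the sixfold expansion stated above. The flip convention (horizontal negates I, vertical negates Q) is likewise an assumed reading.

```python
import math

# Assumed rotation angles; the source text does not preserve the actual values.
ASSUMED_ANGLES = (0.0, math.pi / 2, math.pi, 3 * math.pi / 2)


def rotate(i, q, theta):
    """Rotate every (I, Q) sample counterclockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    rotated_i = [a * c - b * s for a, b in zip(i, q)]
    rotated_q = [a * s + b * c for a, b in zip(i, q)]
    return rotated_i, rotated_q


def flip_horizontal(i, q):
    """Mirror the constellation about the Q axis (negate I)."""
    return [-a for a in i], list(q)


def flip_vertical(i, q):
    """Mirror the constellation about the I axis (negate Q)."""
    return list(i), [-b for b in q]


def augment(i, q):
    """Return six samples: four rotations followed by two flips."""
    samples = [rotate(i, q, theta) for theta in ASSUMED_ANGLES]
    samples.append(flip_horizontal(i, q))
    samples.append(flip_vertical(i, q))
    return samples
```

Each augmented sample keeps the label of the original signal, so a local set of n labeled sequences becomes 6n labeled sequences before training.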
[0094] S3. Construct the core recognition model of this invention, the multi-frequency Mamba model, and deploy it on edge nodes. The local model training shown in ③ of Figure 2 is then executed: it learns frequency-domain weights to complete signal denoising, extracts the frequency-domain representation of the signal, and extracts the spatiotemporal features of long sequence signals;
[0095] The model's internal logic deeply integrates digital signal processing and state-space models, and adopts a federated learning-multi-frequency Mamba (FL-MFMamba) algorithm architecture to achieve signal denoising, signal frequency domain representation extraction, and long sequence signal spatiotemporal feature extraction. It includes a built-in spectrum correction module, multi-frequency attention module, and Mamba-2 backbone network.
[0096] As shown inside MFMamba in Figure 2, the input signal is first transformed to the frequency domain by FFT, an adaptive full-band soft mask is then learned through a dual-branch MLP, and the signal is finally restored by iFFT, adaptively filtering out background noise.
[0097] The spectrum correction module sequentially performs fast Fourier transform, full-band soft mask correction, and inverse fast Fourier transform on the input time-domain I / Q sequence, learns frequency domain weights to complete signal denoising, and outputs the denoised time-domain I / Q sequence and the corresponding signal frequency domain representation.
[0098] The implementation logic of the spectrum correction module is as follows:
[0099] S31. Use the fast Fourier transform to map the time-domain I / Q sequence of length $L$ to a frequency-domain complex-valued sequence:
[0100] $X[k] = \sum_{n=0}^{L-1} \left( I[n] + jQ[n] \right) e^{-j 2\pi k n / L}$;
[0101] where $X[k]$ are the frequency-domain complex coefficients, $L$ is the length of the time-domain I / Q sequence, $j$ is the imaginary unit, $k$ is the frequency index, $n$ is the time index, $I[n]$ are the sampled values of the in-phase component of the time-domain I / Q sequence, and $Q[n]$ are the sampled values of the quadrature component of the time-domain I / Q sequence;
[0102] S32. Decouple the frequency-domain complex coefficients $X[k]$ into the real part $X_r$ and the imaginary part $X_i$, and feed them into two parallel two-layer perceptron (two-branch MLP) networks respectively;
[0103] S33. Use the Tanh activation function to learn the full-band soft mask correction coefficients, and correct the real and imaginary parts separately. The correction formula is as follows:
[0104] $\tilde{X}_r = X_r \odot \tanh\left( f_r(X_r) \right), \quad \tilde{X}_i = X_i \odot \tanh\left( f_i(X_i) \right)$;
[0105] where $f_r$ and $f_i$ are the mapping functions of the two two-layer perceptrons, $\tilde{X}_r$ is the corrected real part, $\tilde{X}_i$ is the corrected imaginary part, $\odot$ is element-wise multiplication, and $\tanh$ is the core activation function of the spectrum correction module, used to implement adaptive frequency-domain soft mask correction;
[0106] S34. Reconstruct the corrected real part $\tilde{X}_r$ and the corrected imaginary part $\tilde{X}_i$ into the frequency-domain complex coefficients $\tilde{X}[k]$:
[0107] $\tilde{X}[k] = \tilde{X}_r[k] + j\tilde{X}_i[k]$;
[0108] The time-domain signal is then restored by the inverse fast Fourier transform:
[0109] $\tilde{x}[n] = \frac{1}{L} \sum_{k=0}^{L-1} \tilde{X}[k]\, e^{j 2\pi k n / L}$;
[0110] where $\tilde{x}[n]$ are the sampled values of the denoised time-domain I / Q sequence.
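The data flow of S31 to S34 (transform, soft-mask the real and imaginary parts separately, transform back) can be sketched with the standard library alone. The sketch uses a naive O(L²) discrete Fourier transform in place of an FFT for brevity. The helper names, the ReLU hidden activation inside `make_mlp`, and the absence of trained weights are all assumptions made for illustration; only the Tanh mask is taken from the text.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive discrete Fourier transform (stand-in for FFT / iFFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(values[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def make_mlp(w1, b1, w2, b2):
    """Two-layer perceptron; the ReLU hidden activation is an assumption."""
    def forward(v):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v)) + b)
                  for row, b in zip(w1, b1)]
        return [sum(w * h for w, h in zip(row, hidden)) + b
                for row, b in zip(w2, b2)]
    return forward


def soft_mask(values, mlp):
    """Element-wise correction: values * tanh(mlp(values))."""
    return [v * math.tanh(m) for v, m in zip(values, mlp(values))]


def spectrum_correct(i, q, mlp_real, mlp_imag):
    """Return denoised I, denoised Q and the corrected spectrum."""
    spectrum = dft([complex(a, b) for a, b in zip(i, q)])
    real = soft_mask([c.real for c in spectrum], mlp_real)
    imag = soft_mask([c.imag for c in spectrum], mlp_imag)
    corrected = [complex(r, m) for r, m in zip(real, imag)]
    restored = dft(corrected, inverse=True)
    return [c.real for c in restored], [c.imag for c in restored], corrected
```

Because the mask lies in (-1, 1), each frequency bin is attenuated (or sign-flipped) rather than hard-gated: a perceptron output far above zero passes the bin almost unchanged, and an output near zero suppresses it.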
[0111] Subsequently, the discrete cosine transform is used instead of traditional global average pooling. The physical significance of this module lies in capturing the microscopic transient characteristics of the signal, preserving the high-frequency modulation information that conventional models erase;
[0112] The multi-frequency attention module receives the frequency domain representation of the signal output from the spectrum correction module, slices the frequency domain representation of the signal along the channel dimension, extracts the frequency domain features of each channel using discrete cosine transform, and generates channel attention weight vectors through a bottleneck structure MLP network to enhance key frequency band features and suppress noise interference.
[0113] The implementation logic of the multi-frequency attention module is as follows:
[0114] S35. Slice the feature map, which is composed of the frequency-domain representation of the signal output by the spectrum correction module, along the channel dimension to form a set of feature sequences $\{x_1, x_2, \dots, x_C\}$, where $x_c$ represents the time series of the $c$-th channel and $C$ is the total number of channels;
[0115] S36. Using the discrete cosine transform instead of global average pooling, extract a single-channel frequency-domain representation $F_c[k]$ for each channel sequence:
[0116] $F_c[k] = \sum_{n=0}^{N-1} x_c[n] \cos\left( \frac{\pi k (2n+1)}{2N} \right)$;
[0117] where $F_c[k]$ is the one-dimensional discrete cosine transform of the time series of the $c$-th channel, $N$ is the total number of frequency-domain points, and $x_c[n]$ is the sampled value of the time series of the $c$-th channel at the $n$-th time point;
[0118] The first $K$ coefficients, which contain low-frequency and mid-to-high-frequency components, are selected to form the compressed frequency-domain descriptor, where $K$ is a positive integer;
[0119] S37. Concatenate the frequency-domain representations of all channels into $F$, input them into the bottleneck-structure MLP network, and generate the channel attention weight vector $a$ using the Sigmoid function:
[0120] $a = \sigma\left( W_2\, \delta\left( W_1 F \right) \right)$;
[0121] where $W_1$ and $W_2$ are the weight matrices of the dimensionality-reduction and dimensionality-expansion layers, respectively, $\delta$ is the ReLU function, and $\sigma$ is the Sigmoid function.
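Steps S35 to S37 amount to a channel attention block whose pooling stage is a truncated DCT. A minimal sketch follows, assuming an unnormalized DCT-II, bias-free bottleneck layers, and hypothetical names (`dct_descriptor`, `channel_attention`, `reweight`); the weight matrices would be learned in practice and are supplied by the caller here.

```python
import math


def dct_descriptor(series, k_keep):
    """First k_keep unnormalized DCT-II coefficients of one channel."""
    n = len(series)
    return [sum(series[t] * math.cos(math.pi * k * (2 * t + 1) / (2 * n))
                for t in range(n))
            for k in range(k_keep)]


def channel_attention(channels, k_keep, w_down, w_up):
    """Sigmoid(W_up * ReLU(W_down * F)) over concatenated DCT descriptors.

    w_down has shape (r, C * k_keep) and w_up has shape (C, r), where r is
    the bottleneck width and C is the number of channels.
    """
    descriptor = [coef for ch in channels
                  for coef in dct_descriptor(ch, k_keep)]
    hidden = [max(0.0, sum(w * d for w, d in zip(row, descriptor)))
              for row in w_down]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_up]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def reweight(channels, weights):
    """Scale every channel by its attention weight."""
    return [[w * v for v in ch] for ch, w in zip(channels, weights)]
```

The coefficient at k = 0 is the channel sum, which is what global average pooling keeps up to a constant factor; the coefficients for k ≥ 1 add the oscillatory content that averaging discards.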
[0122] The Mamba-2 backbone network receives sequences enhanced by features from a multi-frequency attention module. Based on a structured state-space dual mechanism, it employs block decomposition technology to achieve parallel training. It completes the spatiotemporal feature extraction of long sequence signals in a recursive mode with constant memory usage, achieving an inference complexity of O(L), where L is the length of the time-domain I / Q sequence.
[0123] The specific implementation logic of the Mamba-2 backbone network is as follows:
[0124] S38. A structured state-space duality mechanism is adopted. The frequency-domain feature sequence, weighted by the channel attention weight vector output by the multi-frequency attention module, is received as input. During the training phase, block decomposition is used to convert the recurrence of the state equation into parallel matrix multiplications.
[0125] S39. In the inference stage, a recurrent mode with constant memory usage is used to extract the spatiotemporal features of long-sequence signals. Its inference complexity is O(L), and the core state update and output equations are:
[0126] $$h_t = \mathbf{A} h_{t-1} + \mathbf{B} x_t;$$
[0127] $$y_t = \mathbf{C} h_t;$$
[0128] where $h_t$ is the hidden state vector at time step $t$, representing the spatiotemporal characteristics of the signal at time $t$; $h_{t-1}$ is the hidden state vector at time step $t-1$; $\mathbf{A}$ is the diagonal state transition matrix defined by the structured state-space duality mechanism; $\mathbf{B}$ is the input projection matrix; $\mathbf{C}$ is the output projection matrix; $x_t$ is the input feature vector at time step $t$, namely the concatenated frequency-domain descriptor sequence weighted by the channel attention weight vector from S37; and $y_t$ is the output feature vector at time step $t$, namely the extracted spatiotemporal features of the long-sequence signal.
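The recurrent inference mode of S39 can be sketched in plain Python as follows. This is an illustration under simplifying assumptions: the state transition matrix is stored as its diagonal, and the three matrices are fixed over time, whereas Mamba-style selective models make them input-dependent. The function names are hypothetical. The second function evaluates the unrolled sum that the recurrence is equivalent to; it does not implement block decomposition itself and is included only to check the equivalence on small inputs.

```python
def _matvec(w, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]

def ssm_recurrent(xs, a_diag, b_mat, c_mat):
    """h_t = A h_{t-1} + B x_t, y_t = C h_t with diagonal A.

    One pass over the sequence: O(L) time, and the only carried state is h,
    whose size does not depend on the sequence length.
    """
    h = [0.0] * len(a_diag)
    ys = []
    for x in xs:
        bx = _matvec(b_mat, x)
        h = [a * hi + bi for a, hi, bi in zip(a_diag, h, bx)]
        ys.append(_matvec(c_mat, h))
    return ys

def ssm_unrolled(xs, a_diag, b_mat, c_mat):
    """Same outputs from the unrolled form y_t = sum_{s<=t} C A^(t-s) B x_s (O(L^2))."""
    ys = []
    for t in range(len(xs)):
        h = [0.0] * len(a_diag)
        for s in range(t + 1):
            bx = _matvec(b_mat, xs[s])
            h = [hi + (a ** (t - s)) * bi for hi, a, bi in zip(h, a_diag, bx)]
        ys.append(_matvec(c_mat, h))
    return ys
```

Because every output in the unrolled form is an independent sum over earlier inputs, it can be organized as matrix multiplications for training, while the recurrent form is the cheaper choice at inference time.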
[0129] When processing high-sampling-rate signals, MFMamba has a single-sample inference time of only 0.12 ms, about 65 times faster than the Transformer architecture. Moreover, its total parameter count is only 65,000, about 1/5 that of the CNN model, making it well suited for deployment in power-sensitive edge monitoring devices.
[0130] S4. The central server initializes and broadcasts global model weights based on the multi-frequency Mamba model. Each edge monitoring node trains its local multi-frequency Mamba model using an expanded sample set and then uploads the weights. The central server aggregates and updates the weights using a weighted model averaging algorithm and feeds back the updated global weights to the edge monitoring nodes.
[0131] S4 specifically includes:
[0132] S41. After the central server initializes and broadcasts the global model weights, each edge monitoring node trains a multi-frequency Mamba model locally using the expanded sample set, updating the model parameters by mini-batch stochastic gradient descent. As shown at ④ in Figure 2, after training is completed, each edge monitoring node uploads only the model weight parameters to the central server;
[0133] For the $b$-th batch of the $t$-th round at the $m$-th edge monitoring node, the parameter update formula is:
[0134] $$w_m^{t,b+1} = w_m^{t,b} - \eta\,\nabla_{w}\mathcal{L}_m^{t,b}\left(w_m^{t,b}\right);$$
[0135] where $\eta$ is the learning rate, $w_m$ denotes the parameters of the multi-frequency Mamba model computed from the data of the $m$-th edge monitoring node, $\mathcal{L}_m^{t,b}$ is the loss function of the multi-frequency Mamba model for the $b$-th batch of the $t$-th round at the $m$-th edge monitoring node, and $\nabla_{w}\mathcal{L}_m^{t,b}$ is the gradient of the loss function with respect to the model parameters;
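The local update of S41 can be sketched as a generic mini-batch loop. The sketch below treats the model parameters as a flat list of floats and takes the gradient computation as a caller-supplied function; `toy_grad` is a hypothetical stand-in (a one-parameter least-squares model), not the multi-frequency Mamba loss.

```python
def sgd_step(w, grad, lr):
    """One mini-batch SGD step: w <- w - lr * grad."""
    return [p - lr * g for p, g in zip(w, grad)]

def local_train(w_global, batches, grad_fn, lr):
    """Start from the broadcast global weights and apply one SGD step per batch."""
    w = list(w_global)
    for batch in batches:
        w = sgd_step(w, grad_fn(w, batch), lr)
    return w  # only these weights, not the raw samples, would be uploaded

def toy_grad(w, batch):
    """Hypothetical example: gradient of the mean squared error of y ~ w[0] * x."""
    return [sum(2.0 * (w[0] * x - y) * x for x, y in batch) / len(batch)]
```

For example, `local_train([0.0], [[(1.0, 2.0)], [(1.0, 2.0)]], toy_grad, 0.1)` runs two batches and moves the single parameter toward the least-squares solution.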
[0136] S42. As shown at ⑤ in Figure 2, the central server receives the parameters uploaded by all edge monitoring nodes, aggregates them with a weighted averaging algorithm according to each node's share of the total sample size, and generates a new round of global weights that are more balanced and robust.
[0137] The aggregation process is expressed as:
[0138] $$w^{t+1} = \sum_{m=1}^{M}\frac{n_m}{n}\,w_m^{t};$$
[0139] where $w^{t+1}$ is the new global model weight obtained after aggregation by the central server, $M$ is the total number of edge monitoring nodes participating in training, $n_m$ is the number of samples in the local augmented sample set of the $m$-th edge monitoring node, $n = \sum_{m=1}^{M} n_m$ is the total number of samples in the local augmented sample sets of all edge monitoring nodes, and $w_m^{t}$ denotes the multi-frequency Mamba model parameters uploaded by the $m$-th edge monitoring node after the $t$-th round of training;
[0140] This strategy achieves "knowledge sharing" among distributed nodes through parameter-level fusion, enabling the model to learn signal characteristics under different channel environments across regions, significantly improving the generalization performance of the global model.
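The sample-size-weighted aggregation of S42 can be sketched in a few lines. The sketch assumes each node uploads its parameters as a flat list of floats of equal length; the function name `weighted_average` is hypothetical.

```python
def weighted_average(node_weights, node_sizes):
    """Aggregate node parameters, weighting each node by its share of all samples."""
    if len(node_weights) != len(node_sizes):
        raise ValueError("one sample count is required per node")
    total = sum(node_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(node_weights[0])
    return [
        sum((n / total) * w[i] for w, n in zip(node_weights, node_sizes))
        for i in range(dim)
    ]
```

When all nodes hold the same number of samples, this reduces to a plain arithmetic mean of the uploaded parameters; with unequal sample counts, nodes with larger augmented sample sets contribute proportionally more.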
[0141] S5. Step S4 is repeated until the recognition accuracy of the global model stabilizes. Each edge monitoring node, holding the synchronously converged global model, then converts the time-domain radio frequency signal collected in real time into an I/Q signal sequence and performs automatic modulation recognition on it.
[0142] S5 specifically includes:
[0143] The central server distributes the aggregated global weights back to each edge monitoring node, as shown at ⑥ in Figure 2. Each edge monitoring node executes step ⑦ to complete the local model update and enters the next iteration.
[0144] The federated training process of step S4 is repeated until the modulation recognition accuracy of the global multi-frequency Mamba model stabilizes. Each edge monitoring node then uses the parameters of the synchronously converged global multi-frequency Mamba model: the time-domain radio frequency signal acquired in real time is converted into the I/Q signal sequence of the original modulation signal, and the modulation mode is automatically recognized from that sequence, achieving classification and prediction of the modulation type of the radio frequency signal.
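The text does not state a numerical criterion for when accuracy "stabilizes". One possible reading, sketched below, stops when the best accuracy over the most recent rounds no longer exceeds the earlier best by a tolerance. The `patience` and `tol` values, the `max_rounds` cap, and the `run_round` and `evaluate` callbacks are all hypothetical.

```python
def has_plateaued(acc_history, patience=5, tol=1e-3):
    """True when the last `patience` rounds improved the best accuracy by less than tol."""
    if len(acc_history) <= patience:
        return False
    best_before = max(acc_history[:-patience])
    best_recent = max(acc_history[-patience:])
    return best_recent - best_before < tol

def train_until_stable(run_round, evaluate, w0, max_rounds=200, patience=5, tol=1e-3):
    """Repeat one federated round (broadcast, local training, aggregation) until plateau.

    run_round(w) returns the new global weights; evaluate(w) returns an accuracy.
    """
    w, history = w0, []
    for _ in range(max_rounds):
        w = run_round(w)
        history.append(evaluate(w))
        if has_plateaued(history, patience, tol):
            break
    return w, history
```

The round cap guards against a run that never meets the tolerance; in practice the tolerance would be chosen relative to the evaluation noise of the test set.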
[0145] Example 1
[0146] To verify the performance of the automatic modulation identification method based on multi-frequency Mamba under federated learning disclosed in this invention, this embodiment performs simulation verification on the public dataset RadioML2016.10a. This dataset contains 11 modulation schemes with signal-to-noise ratios (SNR) ranging from -20 dB to 18 dB. The experiment uses four distributed edge monitoring nodes and one central server to simulate a heterogeneous, non-independent and identically distributed (non-IID) data scenario.
[0147] Analysis of recognition accuracy under different signal-to-noise ratios:
[0148] As shown in Figure 3, the method of this invention (FL-MFMamba combined with JAM) is compared with the federated benchmark models (FL-CNN, FL-LSTM, FL-Transformer) in terms of correct classification probability under different signal-to-noise ratios.
[0149] Experimental results show that the curve of the method of this invention is always at the top across the entire signal-to-noise ratio range. Specifically, at an SNR of 4dB, the recognition accuracy of the method of this invention reaches 91.76%, significantly higher than the control group without JAM enhancement (75.03%); at an SNR of 12dB, the highest recognition rate of the method of this invention reaches 92.54%, while the highest recognition rate of FL-CNN under the same federated architecture is only 54.19%.
[0150] It can be seen that the present invention effectively suppresses noise interference under low signal-to-noise ratio by combining spectral correction and multi-frequency attention, and significantly improves the classification upper limit.
[0151] Convergence performance analysis of federated learning:
[0152] As shown in Figure 4, the accuracy of each model on the test set is plotted against the number of communication rounds under a non-independent and identically distributed data distribution.
[0153] As can be observed from the convergence curves, the FL-MFMamba proposed in this invention exhibits the fastest convergence speed, achieving an accuracy exceeding 89% and entering a plateau around the 40th communication round. In contrast, FL-Transformer requires approximately 60 rounds to converge, while FL-LSTM and FL-CNN achieve significantly lower accuracy at the same number of rounds.
[0154] The results show that the Mamba model, with its linear complexity selective scanning mechanism, can more efficiently capture the global features of heterogeneous signals during federated weight aggregation, significantly reducing the communication cost of distributed training.
[0155] Validation of the effectiveness of data augmentation methods:
[0156] As shown in Figure 5, ablation comparison experiments are conducted under the FL-MFMamba framework using different enhancement strategies: JAM, rotation transformation, and flip transformation.
[0157] The results show that the JAM strategy outperforms single rotation or flip enhancement across the entire signal-to-noise ratio range. In particular, when the SNR is greater than 0 dB, the JAM strategy is 0.87%–2.36% more accurate than the single rotation method and 1.84%–4.04% more accurate than the flip method.
[0158] It can be seen that the JAM method expands the sample space through multidimensional geometric transformation, enabling the model to learn the phase invariance representation of the signal, effectively overcoming the problem of insufficient generalization caused by the lack of data in distributed nodes.
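The geometric transformations behind JAM can be sketched on an I/Q pair as follows. The sketch assumes that rotation is a counterclockwise rotation of each (I, Q) point, that a horizontal flip negates I and a vertical flip negates Q, and that the joint set is formed by applying each flip to each rotated copy. The angle set of 0, 90, 180 and 270 degrees and all function names are hypothetical illustration choices.

```python
import math

def rotate(i_seq, q_seq, theta):
    """Rotate every (I, Q) point counterclockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return ([i * c - q * s for i, q in zip(i_seq, q_seq)],
            [i * s + q * c for i, q in zip(i_seq, q_seq)])

def flip(i_seq, q_seq, mode):
    """Assumed convention: horizontal flip negates I, vertical flip negates Q."""
    if mode == "horizontal":
        return [-i for i in i_seq], list(q_seq)
    if mode == "vertical":
        return list(i_seq), [-q for q in q_seq]
    raise ValueError(f"unknown flip mode: {mode}")

def jam_augment(i_seq, q_seq,
                angles=(0.0, math.pi / 2, math.pi, 3 * math.pi / 2)):
    """Return each rotated copy plus its horizontal and vertical flips."""
    out = []
    for theta in angles:
        ri, rq = rotate(i_seq, q_seq, theta)
        out.append((ri, rq))
        for mode in ("horizontal", "vertical"):
            out.append(flip(ri, rq, mode))
    return out
```

With quarter-turn angles some combinations coincide, for example a rotation by 180 degrees followed by a horizontal flip equals a vertical flip of the original, so duplicates would need to be removed if only distinct samples are wanted. All augmented copies keep the label of the original sample.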
[0159] Therefore, this invention adopts the above-mentioned automatic modulation recognition method based on multi-frequency Mamba under federated learning. By enhancing and expanding samples through JAM, extracting long sequence features through a lightweight MFMamba model, and protecting privacy through federated weighted aggregation, this invention systematically solves the bottlenecks of existing technologies such as scarce samples, insufficient representation, difficult deployment, and lack of privacy, and achieves efficient and accurate blind recognition in complex environments.
[0160] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can still be made to the technical solutions of the present invention, and these modifications or equivalent substitutions cannot cause the modified technical solutions to deviate from the spirit and scope of the technical solutions of the present invention.
Claims
1. An automatic modulation identification method based on multi-frequency Mamba under federated learning, characterized in that it comprises the following steps:
S1. Build a federated training system consisting of a central server and M edge monitoring nodes. The raw modulation signal data of each edge monitoring node is stored and processed locally and is not transmitted over the network. S1 specifically includes:
S11. Establish a federated training system consisting of a central server as the global aggregator and M edge monitoring nodes as local trainers;
S12. The central server initializes the global weights $w^{0}$ of the multi-frequency Mamba model;
S13. Before training begins, the central server broadcasts the initial weights $w^{0}$, or the global weights $w^{t}$ obtained after each iteration, to all selected active edge monitoring nodes to ensure a consistent initial state;
S2. Each edge monitoring node acquires time-domain radio frequency signals and converts them into I/Q signal sequences of the original modulation signals. A rotation-flip joint enhancement method is used to perform multidimensional geometric transformation on the original modulation signals to expand the local sample set;
S3. Construct a multi-frequency Mamba model and deploy it on the edge nodes; learn frequency-domain weights to complete signal denoising, extract the frequency-domain representation of the signal, and extract the spatiotemporal features of long-sequence signals. The multi-frequency Mamba model built in S3 adopts a federated learning-multi-frequency Mamba algorithm architecture to realize signal denoising, extraction of the signal frequency-domain representation, and extraction of the spatiotemporal features of long-sequence signals, and includes a built-in spectrum correction module, a multi-frequency attention module, and a Mamba-2 backbone network.
The spectrum correction module sequentially performs fast Fourier transform, full-band soft mask correction, and inverse fast Fourier transform on the input time-domain I/Q sequence, learns frequency-domain weights to complete signal denoising, and outputs the denoised time-domain I/Q sequence and the corresponding signal frequency-domain representation.
The multi-frequency attention module receives the frequency-domain representation of the signal output from the spectrum correction module, slices it along the channel dimension, extracts the frequency-domain features of each channel using discrete cosine transform, and generates channel attention weight vectors through a bottleneck-structured MLP network to enhance key frequency band features and suppress noise interference.
The Mamba-2 backbone network receives the sequences whose features have been enhanced by the multi-frequency attention module. Based on the structured state-space duality mechanism, it uses block decomposition to achieve parallel training and completes the spatiotemporal feature extraction of long-sequence signals in a recurrent mode with constant memory usage. The inference complexity is O(L), where L is the length of the time-domain I/Q sequence;
S4. The central server initializes and broadcasts global model weights based on the multi-frequency Mamba model. Each edge monitoring node trains its local multi-frequency Mamba model using the expanded sample set and then uploads the weights. The central server aggregates and updates the weights using a weighted model averaging algorithm and feeds the updated global weights back to the edge monitoring nodes;
S5. Repeat step S4 until the recognition accuracy of the global model stabilizes; each edge monitoring node, holding the synchronously converged global model, then converts the time-domain radio frequency signal collected in real time into an I/Q signal sequence and performs automatic modulation recognition on it.
2. The automatic modulation identification method based on multi-frequency Mamba under federated learning according to claim 1, characterized in that the rotation-flip joint enhancement method in S2 specifically includes:
S21. Let the original signal be $(I, Q)$ and the rotated extended signal sample be $(I', Q')$. The transformation formulas are:
$$I' = I\cos\theta - Q\sin\theta;$$
$$Q' = I\sin\theta + Q\cos\theta;$$
where $\theta$ is the rotation angle, and the signal is rotated counterclockwise by each selected angle;
S22. Perform a flip transformation on the modulated signal $(I, Q)$, flipping it horizontally and vertically. The signal after the flip transformation is $(I_f, Q_f)$, calculated as:
$$(I_f, Q_f) = \begin{cases}(-I,\ Q), & \text{horizontal flip}\\ (I,\ -Q), & \text{vertical flip}\end{cases};$$
where $I_f$ is the in-phase component after the flip transformation and $Q_f$ is the quadrature component after the flip transformation;
S23. The local sample set is expanded by combining the rotation and flip transformations.
3. The automatic modulation identification method based on multi-frequency Mamba under federated learning according to claim 1, characterized in that the implementation logic of the spectrum correction module in S3 is as follows:
S31. Use the fast Fourier transform to map the time-domain I/Q sequence of length $L$ to a frequency-domain complex-valued sequence:
$$X(k) = \sum_{n=0}^{L-1}\left[I(n) + jQ(n)\right]e^{-j\frac{2\pi}{L}kn};$$
where $X(k)$ are the frequency-domain complex coefficients, $L$ is the length of the time-domain I/Q sequence, $j$ is the imaginary unit, $k$ is the frequency index, $n$ is the time index, $I(n)$ are the sampled values of the in-phase component of the time-domain I/Q sequence, and $Q(n)$ are the sampled values of the quadrature component of the time-domain I/Q sequence;
S32. The frequency-domain complex coefficients $X(k)$ are decoupled into the real part $X_r$ and the imaginary part $X_i$, which are fed into two parallel two-layer perceptron networks, respectively;
S33. Use the Tanh activation function to learn the full-band soft mask correction coefficients, and correct the real and imaginary parts separately. The correction formula is:
$$\tilde{X}_r = X_r \odot \tanh\left(f_r(X_r)\right),\qquad \tilde{X}_i = X_i \odot \tanh\left(f_i(X_i)\right);$$
where $f_r$ and $f_i$ are the mapping functions of the two two-layer perceptrons, respectively, $\tilde{X}_r$ is the corrected real part, $\tilde{X}_i$ is the corrected imaginary part, $\odot$ is element-wise multiplication, and $\tanh$ is the core activation function of the spectrum correction module, used to implement adaptive frequency-domain soft mask correction;
S34. The corrected real part $\tilde{X}_r$ and imaginary part $\tilde{X}_i$ are reconstructed into the frequency-domain complex coefficients $\tilde{X}(k)$. The formula is:
$$\tilde{X}(k) = \tilde{X}_r(k) + j\tilde{X}_i(k);$$
The time-domain signal is restored by inverse fast Fourier transform, given by:
$$\tilde{x}(n) = \frac{1}{L}\sum_{k=0}^{L-1}\tilde{X}(k)\,e^{j\frac{2\pi}{L}kn};$$
where $\tilde{x}(n)$ are the sampled values of the time-domain I/Q sequence after denoising.
4. The automatic modulation identification method based on multi-frequency Mamba under federated learning according to claim 3, characterized in that the specific implementation logic of the multi-frequency attention module in S3 is as follows:
S35. The feature map formed by the frequency-domain representation of the signal output by the spectrum correction module is sliced along the channel dimension into a set of feature sequences $\{x_1, x_2, \dots, x_C\}$, where $x_c$ denotes the time series of the $c$-th channel and $C$ is the total number of channels;
S36. Discrete cosine transform is used in place of global average pooling to extract a single-channel frequency-domain representation $F_c(k)$ from each channel sequence. The formula is:
$$F_c(k) = \sum_{n=0}^{N-1} x_c(n)\cos\left[\frac{\pi}{N}\left(n+\frac{1}{2}\right)k\right],\quad k = 0, 1, \dots, N-1;$$
where $F_c(k)$ is the one-dimensional discrete cosine transform of the time series of the $c$-th channel, $N$ is the total number of frequency-domain points, and $x_c(n)$ is the sampled value of the time series of the $c$-th channel at the $n$-th time point; the first $K$ coefficients, which contain low-frequency and mid-to-high-frequency components, are selected to form the compressed frequency-domain descriptor, where $K$ is a positive integer;
S37. The frequency-domain descriptors of all channels are concatenated into a vector $z$ and input into the bottleneck-structured MLP network, and the channel attention weight vector $s$ is generated using the Sigmoid function. The formula is:
$$s = \sigma\left(W_2\,\delta\left(W_1 z\right)\right);$$
where $W_1$ and $W_2$ are the weight matrices of the dimensionality-reduction and dimensionality-expansion layers, respectively, $\delta$ denotes the ReLU function, and $\sigma$ denotes the Sigmoid function.
5. The automatic modulation identification method based on multi-frequency Mamba under federated learning according to claim 4, characterized in that the specific implementation logic of the Mamba-2 backbone network in S3 is as follows:
S38. A structured state-space duality mechanism is adopted. The frequency-domain feature sequence, weighted by the channel attention weight vector output by the multi-frequency attention module, is received as input. During the training phase, block decomposition is used to convert the recurrence of the state equation into parallel matrix multiplications;
S39. In the inference stage, a recurrent mode with constant memory usage is used to extract the spatiotemporal features of long-sequence signals. Its inference complexity is O(L), and the core state update and output equations are:
$$h_t = \mathbf{A} h_{t-1} + \mathbf{B} x_t;$$
$$y_t = \mathbf{C} h_t;$$
where $h_t$ is the hidden state vector at time step $t$, representing the spatiotemporal characteristics of the signal at time $t$; $h_{t-1}$ is the hidden state vector at time step $t-1$; $\mathbf{A}$ is the diagonal state transition matrix defined by the structured state-space duality mechanism; $\mathbf{B}$ is the input projection matrix; $\mathbf{C}$ is the output projection matrix; $x_t$ is the input feature vector at time step $t$, namely the concatenated frequency-domain descriptor sequence weighted by the channel attention weight vector from S37; and $y_t$ is the output feature vector at time step $t$, namely the extracted spatiotemporal features of the long-sequence signal.
6. The automatic modulation identification method based on multi-frequency Mamba under federated learning according to claim 1, characterized in that S4 specifically includes:
S41. After the central server initializes and broadcasts the global model weights, each edge monitoring node trains a multi-frequency Mamba model locally using the expanded sample set, and the model parameters are updated through mini-batch stochastic gradient descent. For the $b$-th batch of the $t$-th round at the $m$-th edge monitoring node, the parameter update formula is:
$$w_m^{t,b+1} = w_m^{t,b} - \eta\,\nabla_{w}\mathcal{L}_m^{t,b}\left(w_m^{t,b}\right);$$
where $\eta$ is the learning rate, $w_m$ denotes the parameters of the multi-frequency Mamba model computed from the data of the $m$-th edge monitoring node, $\mathcal{L}_m^{t,b}$ is the loss function of the multi-frequency Mamba model for the $b$-th batch of the $t$-th round at the $m$-th edge monitoring node, and $\nabla_{w}\mathcal{L}_m^{t,b}$ is the gradient of the loss function with respect to the model parameters;
S42. The central server is responsible for aggregating client parameters using a weighted averaging algorithm. The aggregation process is expressed as:
$$w^{t+1} = \sum_{m=1}^{M}\frac{n_m}{n}\,w_m^{t};$$
where $w^{t+1}$ is the new global model weight obtained after aggregation by the central server, $M$ is the total number of edge monitoring nodes participating in training, $n_m$ is the number of samples in the local augmented sample set of the $m$-th edge monitoring node, $n = \sum_{m=1}^{M} n_m$ is the total number of samples in the local augmented sample sets of all edge monitoring nodes, and $w_m^{t}$ denotes the multi-frequency Mamba model parameters uploaded by the $m$-th edge monitoring node after the $t$-th round of training.
7. The automatic modulation identification method based on multi-frequency Mamba under federated learning according to claim 1, characterized in that S5 specifically includes: the federated training process of step S4 is repeated until the modulation recognition accuracy of the global multi-frequency Mamba model stabilizes; each edge monitoring node then uses the parameters of the synchronously converged global multi-frequency Mamba model, converts the time-domain radio frequency signal acquired in real time into the I/Q signal sequence of the original modulation signal, and automatically recognizes the modulation mode from the I/Q signal sequence, thereby classifying and predicting the modulation type of the radio frequency signal.