An eavesdropping detection method based on Transformer feature representation and communication signal enhancement
By performing spectrum analysis and Transformer feature characterization on the radio frequency observation signals of legitimate receivers, a whitelist prototype library is constructed. This addresses the problem of identifying unknown, illegitimate receivers in open scenarios and enables accurate detection and rejection of unknown devices, thereby improving the security and robustness of wireless communication.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SOUTHEAST UNIV
- Filing Date
- 2026-05-21
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies struggle to effectively identify and detect unknown, unauthorized receivers in open environments, especially in IoT and unmanned systems. Traditional methods are often unable to directly detect unauthorized receiving devices, and existing methods are primarily geared towards closed-set identification scenarios, lacking sufficient ability to reject unknown devices.
By performing spectral analysis on the radio frequency observation signals of registered legitimate receivers, local spectral features of local oscillator leakage are extracted. A training sample subset is constructed and trained using a Transformer feature representation model. A prototype library of legitimate receiver whitelists is built. The feature representation model is then used to perform similarity matching and threshold discrimination on the test samples to identify potential illegal receivers.
It improves the accuracy and robustness of detecting unauthorized receivers in open environments, can identify unknown receiving devices, and enhances the security and reliability of wireless communication environments.

Figure CN122248416A_ABST
Description
Technical Field
[0001] This invention relates to the field of wireless communication security, and specifically to an eavesdropping detection method based on Transformer feature representation and communication signal enhancement.
Background Technology
[0002] With the continuous development and widespread application of wireless communication technology, the security protection of information transmission links in open wireless environments has become increasingly prominent. Especially in the Internet of Things (IoT), unmanned systems, and other high-security communication scenarios, the risk of unauthorized receiving devices acquiring communication information through passive reception is constantly increasing. Unlike active interference, unauthorized receivers typically do not actively emit detection signals, possessing strong concealment and making them difficult to detect directly using traditional link monitoring methods. Therefore, how to effectively identify potential unauthorized receivers in complex open environments has become an urgent problem to be solved in the field of wireless communication security.
[0003] Existing physical layer security research primarily mitigates eavesdropping risks through methods such as encryption coding, artificial noise injection, and security rate optimization. These methods mainly focus on improving the anti-eavesdropping capabilities of legitimate links, but they typically assume the location and channel state information of the eavesdropper are known, or only indirectly assess the eavesdropping risk, making it difficult to directly detect unauthorized receiving devices. On the other hand, radio frequency fingerprinting technology utilizes the non-ideal characteristics of device hardware for identity verification, providing a new approach to detecting unauthorized receivers. However, existing methods are mostly geared towards closed-set identification scenarios, where the device categories in the training and test sets are largely the same. When unregistered or unknown-model receiving devices appear in real-world scenarios, insufficient rejection capability can easily occur, making it difficult to meet the actual needs of detecting unknown devices in open environments.
[0004] Local oscillator leakage originates from imperfect mixer port isolation and circuit board-level spatial coupling. Under certain conditions, it can characterize hardware differences between devices, providing usable observational features for identifying unauthorized receivers. Current research on local oscillator leakage primarily focuses on leakage suppression; it lacks a method for identifying unauthorized receivers that uses a legitimate device whitelist built upon local oscillator leakage as a reference and performs rejection checks on unknown receiving devices. Therefore, a novel technical solution is urgently needed to improve the ability to identify and detect potentially unauthorized receivers in wireless communication scenarios.
Summary of the Invention
[0005] Purpose of the invention: This invention addresses the problems of existing illegal receiver detection methods being unable to adapt to open scenarios and lacking the ability to identify unknown receiving devices outside the whitelist. It provides an eavesdropping detection method based on Transformer feature representation and communication signal enhancement, which can improve the accuracy and robustness of illegal receiver detection.
[0006] Technical Solution: To achieve the above-mentioned objectives, this invention provides an eavesdropping detection method based on Transformer feature representation and communication signal enhancement, comprising the following steps:
[0007] Step 1: When the legitimate communication link is in normal working condition, collect radio frequency observation signals containing local oscillator leakage components for registered legitimate receivers under different working conditions and environmental conditions to form a legitimate receiver observation sample set;
[0008] Step 2: Perform spectrum analysis and preprocessing on the legal receiver observation sample set, extract the local spectrum features corresponding to the local oscillator leakage, and construct a training sample subset and a reference sample subset;
[0009] Step 3: Train the Transformer-based feature representation model based on the training sample subset. The feature representation model is used to extract the feature vector of the local oscillator leakage spectrum of the receiver.
[0010] Step 4: Input the reference prototype samples used for prototype construction from the reference sample subset into the trained feature representation model to obtain the corresponding feature vectors, and construct the prototype feature library according to the legitimate receiver category;
[0011] Step 5: Perform the same preprocessing and local feature extraction as in Step 2 on the radio frequency observation signal to be tested, obtain the sample to be tested, and input it into the feature representation model to obtain the feature vector of the sample to be tested;
[0012] Step 6: Perform similarity matching between the feature vector of the sample to be tested and the features of each legitimate receiver prototype in the prototype feature library to obtain the matching score of each category, as well as the best matching category and the second best matching category. Construct the legality judgment score of the sample to be tested using the score difference between the best matching category and the second best matching category.
[0013] Step 7: Perform eavesdropping detection on the sample to be tested based on the matching results: When the legality discrimination score of the sample to be tested does not meet the discrimination threshold condition, the sample to be tested is judged as a potential illegal receiver and a detection alarm is output.
[0014] Furthermore, in step 1, multiple observation windows are collected while varying the operating state and the environmental conditions between the receiver and the transmitter, forming observation samples corresponding to each legitimate receiver category. The changes in operating state include changes in terminal orientation and temperature, and the changes in environmental conditions include changes in relative distance and obstruction. The collected radio frequency observation signal is represented as $r(t) = s(t) + l(t) + n(t)$, where $t$ denotes time, $s(t)$ represents the communication signal component, $l(t)$ represents the local oscillator leakage component, and $n(t)$ represents the disturbance term. The legitimate receiver observation sample set is represented as $\mathcal{D} = \{(\mathbf{r}_i, y_i)\}_{i=1}^{N}$, where $\mathbf{r}_i$ is the discrete observation sequence corresponding to the i-th sampling window, containing a fixed number of sampling points and obtained by discretizing the continuous-time observation signal according to a preset sampling period; $y_i$ is the legitimate receiver category identifier corresponding to the i-th sampling window; and $N$ is the total number of whitelist observation samples.
[0015] Further, step 2 involves performing spectral analysis and preprocessing on the legitimate receiver observation sample set to extract the local spectral features corresponding to the local oscillator leakage, specifically including:
[0016] For each discrete observation sequence in the legal receiver observation sample set, after DC removal and windowing processing, a spectrum transformation is performed to obtain the corresponding frequency domain representation and amplitude spectrum. The frequency position corresponding to each spectrum point is determined by combining the sampling frequency.
[0017] Determine the leakage center frequency, extract a local frequency band of a preset width around the leakage center frequency, and perform fixed-length interpolation resampling on the local spectrum according to the target frequency grid to obtain a fixed-length local spectrum with a unified dimension;
[0018] Logarithmic compression, dynamic range pruning, and normalization are performed on the fixed-length local spectrum to obtain the local spectral feature vector corresponding to the discrete observation sequence.
[0019] Furthermore, step 2 involves constructing a subset of training samples and a subset of reference samples, specifically including:
[0020] Constructing a local spectral feature sample set for legitimate receivers: $\mathcal{X} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, where $\mathbf{x}_i$ is the local spectral feature vector of the local oscillator leakage corresponding to the i-th sample, $y_i$ is the legitimate receiver category identifier corresponding to the i-th sample, and $N$ is the total number of whitelist observation samples;
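[0021] The local spectral feature sample set of legitimate receivers $\mathcal{X}$ is divided into a training sample subset $\mathcal{X}_{\mathrm{tr}} = \{(\mathbf{x}_i, y_i) \mid i \in \mathcal{I}_{\mathrm{tr}}\}$ and a reference sample subset $\mathcal{X}_{\mathrm{ref}} = \{(\mathbf{x}_i, y_i) \mid i \in \mathcal{I}_{\mathrm{ref}}\}$, where $\mathcal{I}_{\mathrm{tr}}$ and $\mathcal{I}_{\mathrm{ref}}$ denote the training sample index set and the reference sample index set, respectively, and satisfy $\mathcal{I}_{\mathrm{tr}} \cap \mathcal{I}_{\mathrm{ref}} = \varnothing$ and $\mathcal{I}_{\mathrm{tr}} \cup \mathcal{I}_{\mathrm{ref}} = \{1, \dots, N\}$. Both the training sample subset and the reference sample subset cover the different categories of legitimate receivers as well as the different working conditions and environmental conditions.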
[0022] Further, step 3 involves training the Transformer-based feature representation model based on the aforementioned subset of training samples, specifically including:
[0023] The local spectral feature vectors of the training samples are arranged into an input sequence in frequency order. After embedding mapping and positional encoding, a global representation marker vector is added at the beginning of the sequence, and the sequence is fed into a feature representation network composed of multiple stacked Transformer encoders.
[0024] The output corresponding to the global representation marker is extracted, and nonlinear projection and normalization are performed to obtain the embedding representation of the training sample in the feature space. The embedding representation of the i-th training sample in the feature space is $\mathbf{z}_i = f_{\theta}(\mathbf{x}_i),\ i \in \mathcal{I}_{\mathrm{tr}}$, where $\mathbf{x}_i$ is the local spectral feature vector of the local oscillator leakage corresponding to the i-th training sample, $\mathcal{I}_{\mathrm{tr}}$ is the training sample index set, $f_{\theta}(\cdot)$ is the feature representation model, and $\theta$ denotes the trainable parameters of the feature representation model. A cosine similarity classification head is introduced during the training phase to calculate the class output scores; the classification output of the i-th training sample for the c-th class is $s_{i,c} = \gamma \cdot \dfrac{\mathbf{w}_c^{\mathsf{T}} \mathbf{z}_i}{\|\mathbf{w}_c\|_2 \, \|\mathbf{z}_i\|_2}$, where $\mathbf{w}_c$ is the trainable class weight vector of the c-th class, $\gamma$ is the scaling factor, $s_{i,c}$ is the classification score of the i-th training sample belonging to the c-th class of legitimate receivers, $(\cdot)^{\mathsf{T}}$ denotes transpose, and $\|\cdot\|_2$ denotes the 2-norm of a vector;
[0025] Based on the class labels of the training sample subset, the cross-entropy loss is defined as $\mathcal{L}_{\mathrm{CE}} = -\dfrac{1}{N_{\mathrm{tr}}} \sum_{i \in \mathcal{I}_{\mathrm{tr}}} \log \dfrac{\exp(s_{i,y_i})}{\sum_{k=1}^{C} \exp(s_{i,k})}$, where $N_{\mathrm{tr}}$ is the number of training samples (the size of the training sample index set $\mathcal{I}_{\mathrm{tr}}$), $C$ is the total number of legitimate receiver classes, $k$ is the class index, $y_i$ is the legitimate receiver category identifier of the i-th training sample, $s_{i,y_i}$ is the classification score of the i-th training sample for class $y_i$, and $s_{i,k}$ is the classification score of the i-th training sample for the k-th class;
[0026] A feature center is constructed for each class of legitimate receivers. The feature center of the class-c samples in the training sample subset is represented as $\boldsymbol{\mu}_c = \dfrac{1}{N_c} \sum_{i \in \mathcal{I}_{\mathrm{tr}},\, y_i = c} \mathbf{z}_i$, where $\mathbf{z}_i$ is the embedding of the i-th training sample, $N_c$ is the number of training samples of class c, and $\boldsymbol{\mu}_c$ is the center position of the class-c samples in the feature space;
[0027] An intra-class compactness loss is constructed to constrain samples of the same class to cluster towards the feature center of their class;
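[0028] The weighted combination of the cross-entropy loss and the intra-class compactness loss yields the overall training objective function $\mathcal{L} = \lambda_1 \mathcal{L}_{\mathrm{CE}} + \lambda_2 \mathcal{L}_{\mathrm{intra}}$, where $\mathcal{L}_{\mathrm{CE}}$ is the cross-entropy loss, $\mathcal{L}_{\mathrm{intra}}$ is the intra-class compactness loss, and $\lambda_1$ and $\lambda_2$ are the weighting coefficients;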
[0029] The overall training objective function is optimized iteratively to obtain the trained model parameters. After model training is complete, the cosine similarity classification head used for supervised training is removed to obtain the final feature representation model.
[0030] Furthermore, step 4 involves constructing a prototype feature library according to the legitimate receiver category, specifically including:
[0031] Input the local spectral feature vectors of each reference prototype sample in the reference sample subset into the feature representation model obtained in step 3 to obtain the corresponding reference feature vectors;
[0032] For each class of legitimate receivers, all of its reference feature vectors are collected. The prototype feature of the c-th class of legitimate receivers is constructed by intra-class mean aggregation, and the prototype feature vector is represented as $\mathbf{p}_c = \dfrac{1}{M_c} \sum_{\mathbf{z}_i \in \mathcal{Z}_c} \mathbf{z}_i$, where $\mathbf{z}_i$ is the reference feature vector corresponding to the i-th reference sample, $M_c$ is the number of reference samples of the c-th class of legitimate receivers, and $\mathcal{Z}_c$ is the set of reference feature vectors corresponding to the c-th class of legitimate receivers;
[0033] The prototype feature vector $\mathbf{p}_c$ is normalized: $\tilde{\mathbf{p}}_c = \mathbf{p}_c / \|\mathbf{p}_c\|_2$, where $\tilde{\mathbf{p}}_c$ is the normalized prototype feature of the c-th class of legitimate receivers.
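The prototype construction described in step 4 (intra-class mean aggregation followed by normalization) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_prototype_library`, the integer class labels, and the use of NumPy arrays for the reference feature vectors are all assumptions made for the example.

```python
import numpy as np

def build_prototype_library(ref_features, ref_labels):
    """Return {class_id: unit-norm prototype} from reference embeddings.

    ref_features: (n, d) array of reference feature vectors (hypothetical input).
    ref_labels:   (n,) array of integer legitimate-receiver class identifiers.
    """
    ref_features = np.asarray(ref_features, dtype=float)
    ref_labels = np.asarray(ref_labels)
    library = {}
    for c in np.unique(ref_labels):
        # Intra-class mean aggregation over all reference vectors of class c
        mean_vec = ref_features[ref_labels == c].mean(axis=0)
        # L2 normalization so that a dot product equals cosine similarity
        library[int(c)] = mean_vec / np.linalg.norm(mean_vec)
    return library
```

Storing unit-norm prototypes means that the later similarity matching reduces to a dot product with a unit-norm test embedding.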
[0034] Furthermore, the set of reference feature vectors corresponding to the c-th class of legitimate receivers is divided into $K_c$ sub-clusters. Each reference feature vector belongs to exactly one sub-cluster, the sub-clusters do not overlap, and the union of the sub-clusters constitutes the set of all reference features of class c;
[0035] For the k-th sub-cluster of the c-th class of legitimate receivers, the corresponding sub-prototype feature vector is constructed by mean aggregation within the sub-cluster and is represented as $\mathbf{p}_{c,k} = \dfrac{1}{M_{c,k}} \sum_{\mathbf{z}_i \in \mathcal{Z}_{c,k}} \mathbf{z}_i$, where $\mathcal{Z}_{c,k}$ is the set of reference feature vectors corresponding to the k-th sub-cluster of the c-th class of legitimate receivers, and $M_{c,k}$ is the number of feature vectors within that sub-cluster;
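[0036] The sub-prototype feature vector $\mathbf{p}_{c,k}$ is normalized to obtain $\tilde{\mathbf{p}}_{c,k} = \mathbf{p}_{c,k} / \|\mathbf{p}_{c,k}\|_2$.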
[0037] Furthermore, in step 6, cosine similarity is used to calculate the degree of closeness between the feature vector of the sample to be tested and the features of each legitimate receiver prototype;
[0038] When multiple sub-prototypes correspond to the same legitimate receiver category, the maximum value of the similarity results of multiple sub-prototypes within the same category or the average of the Top-N similarities is taken as the matching score for that category, where N≥2;
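The two aggregation rules named above (maximum over sub-prototypes, or the mean of the Top-N similarities) can be sketched as below. The function name `category_scores`, the dictionary layout of the sub-prototype library, and the `top_n` parameter are hypothetical choices for illustration only.

```python
import numpy as np

def category_scores(z, sub_prototypes, top_n=None):
    """Cosine-similarity matching score per legitimate receiver class.

    z:              (d,) feature vector of the sample to be tested.
    sub_prototypes: {class_id: (K_c, d) array of sub-prototypes} (assumed layout).
    top_n:          None -> maximum similarity; N >= 2 -> mean of the Top-N.
    """
    z = np.asarray(z, dtype=float)
    z = z / np.linalg.norm(z)
    scores = {}
    for c, protos in sub_prototypes.items():
        p = np.atleast_2d(np.asarray(protos, dtype=float))
        p = p / np.linalg.norm(p, axis=1, keepdims=True)
        sims = np.sort(p @ z)[::-1]          # cosine similarities, descending
        if top_n is None:
            scores[c] = float(sims[0])       # maximum over sub-prototypes
        else:
            # Mean of the Top-N (uses all sub-prototypes if fewer than N exist)
            scores[c] = float(sims[:top_n].mean())
    return scores
```

The maximum rule favors the single closest operating condition, whereas the Top-N mean is less sensitive to one outlying sub-prototype.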
[0039] Based on the category matching score vector, the best matching category of the sample to be tested is determined as $c^{*} = \arg\max_{c \in \{1, \dots, C\}} m_c$, where $m_c$ is the matching score of the c-th class of legitimate receivers and $C$ is the total number of legitimate receiver classes.
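[0040] A validity score is constructed using the score difference between the best and second-best matching categories. The second-largest category matching score is represented as $m_{(2)} = \max_{c \neq c^{*}} m_c$, where $m_c$ is the matching score of the c-th class and $c^{*}$ is the best matching category;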
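[0041] The validity score of the sample to be tested is defined in terms of the highest category matching score and the score difference between the best and second-best matching categories, combined through a weighting coefficient that satisfies a preset constraint.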
[0042] Furthermore, the legality judgment condition in step 7 is $S \geq \tau$, where $S$ is the validity score of the sample to be tested and $\tau$ is the legality threshold, which is determined based on the validity score distribution of the reference calibration samples used for threshold calibration in the reference sample subset. If the sample to be tested meets the legality condition, it is determined to belong to the best-matched legitimate receiver category; otherwise, it is determined not to belong to any registered legitimate receiver category and is marked as a potential illegal receiver.
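The decision logic of steps 6 and 7 can be sketched as follows. Several parts are assumptions rather than the patent's stated formulas: the score is taken here as `alpha * best + (1 - alpha) * (best - second)`, the threshold is taken as a low quantile of the calibration scores, and the names `legality_score`, `calibrate_threshold`, and `detect` are hypothetical. At least two registered classes are required for a second-best score to exist.

```python
import numpy as np

def legality_score(scores, alpha=0.5):
    """Return (best_class, score). The weighted form below is an ASSUMPTION."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best_class, best), (_, second) = ordered[0], ordered[1]
    # Combine the best matching score with the best-vs-second-best margin
    return best_class, alpha * best + (1.0 - alpha) * (best - second)

def calibrate_threshold(calibration_scores, quantile=0.05):
    """Pick the threshold as a low quantile of legitimate calibration scores
    (one possible rule; the calibration rule is not specified in the text)."""
    return float(np.quantile(calibration_scores, quantile))

def detect(scores, threshold, alpha=0.5):
    """Return (class_id, score) if accepted, or (None, score) to raise an alarm."""
    best_class, s = legality_score(scores, alpha)
    return (best_class, s) if s >= threshold else (None, s)
```

With a quantile-based threshold, the chosen quantile directly sets the fraction of legitimate calibration samples that would be rejected, which makes the false-alarm trade-off explicit.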
[0043] The present invention also provides a computer system, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the computer program is executed by the processor, it implements the steps of the eavesdropping detection method based on Transformer feature representation and communication signal enhancement.
[0044] Beneficial Effects: Compared with existing technologies, this invention has the following advantages: This invention obtains corresponding local spectral feature vectors by performing spectral analysis and feature extraction on observation samples from registered legitimate receivers. Based on these local spectral feature vectors, a prototype library of a legitimate receiver whitelist is constructed. Furthermore, by combining this library with Transformer to perform representation learning on the spectral features, the legitimate category identification of the receiving device under test and the rejection of unknown receiving devices are achieved. In addition, this invention collects observation samples when the legitimate communication link is in normal working condition. This ensures that, in addition to local oscillator leakage components and noise components, the samples also retain communication components reflecting the current communication service status, propagation environment, and background interference conditions, thereby improving the model's adaptability and generalization ability to different scenarios. This invention fully utilizes the differences in local oscillator leakage spectral shapes among different receiving devices, solving the problem of insufficient distinguishing ability of existing closed-set identification methods when unknown receiving devices are present. This invention constructs a whitelist prototype library using registered legitimate receiver reference samples. The Transformer spectral representation of the sample to be tested is then matched with the whitelist prototypes for similarity and threshold discrimination. When the sample to be tested does not meet the matching criteria with any legitimate receiver prototype, it is determined to be an illegitimate receiver, thus achieving open set identification. 
Therefore, this invention can determine the legitimacy of unknown receiving devices without pre-acquiring samples of illegitimate receiver categories, which improves the accuracy and robustness of receiving device legitimacy determination under open set conditions, thereby enhancing the security and reliability of illegitimate receiver identification in wireless communication environments.
Brief Description of the Drawings
[0045] Figure 1 is a flowchart illustrating the overall process of an embodiment of the present invention;
[0046] Figure 2 is a ROC curve diagram from an embodiment of the present invention;
[0047] Figure 3 is a bar chart showing the key performance indicators in an embodiment of the present invention;
[0048] Figure 4 is a schematic diagram of the distribution of discrimination scores and the discrimination threshold for legitimate and illegitimate receivers in an embodiment of the present invention.
Detailed Implementation
[0049] The present invention will be further described below with reference to the accompanying drawings and specific embodiments.
[0050] As shown in Figure 1, the eavesdropping detection method based on Transformer feature representation and communication signal enhancement disclosed in this invention includes the following steps:
[0051] Step 1: When the legitimate communication link is in normal working condition, collect radio frequency observation signals containing local oscillator leakage components for registered legitimate receivers under different working conditions and environmental conditions to form a legitimate receiver observation sample set;
[0052] Step 2: Perform spectrum analysis and preprocessing on the legal receiver observation sample set, extract the local spectrum features corresponding to the local oscillator leakage, and construct a training sample subset and a reference sample subset;
[0053] Step 3: Train the Transformer-based feature representation model based on the training sample subset. The feature representation model is used to extract the feature vector of the local oscillator leakage spectrum of the receiver.
[0054] Step 4: Input the reference prototype samples used for prototype construction from the reference sample subset into the trained feature representation model to obtain the corresponding feature vectors, and construct the prototype feature library according to the legitimate receiver category;
[0055] Step 5: Perform the same preprocessing and local feature extraction as in Step 2 on the radio frequency observation signal to be tested, obtain the sample to be tested, and input it into the feature representation model to obtain the feature vector of the sample to be tested;
[0056] Step 6: Perform similarity matching between the feature vector of the sample to be tested and the features of each legitimate receiver prototype in the prototype feature library to obtain the matching score of each category, as well as the best matching category and the second best matching category. Construct the legality judgment score of the sample to be tested using the score difference between the best matching category and the second best matching category.
[0057] Step 7: Perform eavesdropping detection on the sample to be tested based on the matching results: When the legality discrimination score of the sample to be tested does not meet the discrimination threshold condition, the sample to be tested is judged as a potential illegal receiver and a detection alarm is output.
[0058] In step 1, when the legitimate communication link is in normal working condition, radio frequency observation signals containing local oscillator leakage components are collected for several registered legitimate receivers under different operating states and environmental conditions to form a legitimate receiver observation sample set. The details are as follows:
[0059] When the legitimate communication link is functioning normally, signals within the target frequency band are sampled through monitoring nodes. To establish the legitimate receiver observation sample set, the registered legitimate receivers are sampled one by one according to their legitimate receiver category. During each acquisition, only one legitimate receiver is turned on, while the rest remain off. By varying the operating state (terminal orientation and temperature) and the environmental conditions (relative distance between the receiver and the transmitter, and obstruction), multiple observation windows are collected to form the observation samples corresponding to each legitimate receiver category.
[0060] During the aforementioned data acquisition process, the observation signals received by the monitoring nodes include not only the local oscillator leakage component of the currently active legitimate receiver, but also the communication components corresponding to the actual transmission state of the legitimate communication link, as well as noise, spurious signals, and environmental disturbance components. Since these communication components reflect the current communication service status, propagation environment, and background interference conditions, the adaptability and generalization ability of subsequent models to different environmental disturbances and scene changes are improved. The received observation signals are represented as:
[0061] $r(t) = s(t) + l(t) + n(t)$ (1)
[0062] where $t$ denotes time, $s(t)$ represents the communication signal component, $l(t)$ represents the local oscillator leakage component, and $n(t)$ represents disturbance terms such as noise and spurious signals.
[0063] Next, the continuous-time observation signals are discretized according to a preset sampling period to obtain the discrete observation sequence corresponding to a single observation window, where each observation window contains a fixed number of sampling points. The discrete observation sequence obtained from each acquisition is then bound to the category identifier of the currently active legitimate receiver, thereby obtaining the legitimate receiver observation sample set $\mathcal{D}$:
[0064] $\mathcal{D} = \{(\mathbf{r}_i, y_i)\}_{i=1}^{N}$ (2)
[0065] where $\mathbf{r}_i$ is the discrete observation sequence corresponding to the i-th sampling window, $y_i$ is the legitimate receiver category identifier corresponding to the i-th sampling window, and $N$ is the total number of whitelist observation samples.
[0066] In step 2, spectral analysis and preprocessing are performed on the legitimate receiver observation sample set obtained in step 1 to extract the local spectral features related to local oscillator leakage from each discrete observation sequence, and a training sample subset and a reference sample subset are constructed from these features. The details are as follows:
[0067] Take any discrete observation sequence from the legitimate receiver observation sample set obtained above. To suppress the influence of the DC component on subsequent spectral analysis, the DC component is first removed from the discrete sequence. Next, to reduce spectral leakage caused by finite-length sampling, the DC-free sequence is multiplied by a window function, specifically the Hanning window. After this preprocessing, the windowed discrete sequence is subjected to spectral transformation to obtain its frequency domain representation and amplitude spectrum. The frequency position of each spectral point is then determined from the sampling frequency.
[0068] Local oscillator leakage components typically manifest as narrow-band spectral peaks within the target monitoring frequency band, while communication signals usually exhibit a wider-bandwidth spectral envelope. Therefore, after the frequency domain representation is obtained, the leakage center frequency is first determined; a local frequency band of a preset width is then extracted around this leakage center frequency, yielding local amplitude-frequency samples near the center frequency from the full spectrum. In practical applications, the leakage center frequency can be determined by considering the narrow-band nature, peak significance, and frequency stability of the candidate peaks. For example, by scanning the target frequency band, peaks can be identified from the spectrum as narrow-band, prominent (exceeding a preset threshold), or stably occurring in multiple observation windows (e.g., more than 3 frames). In simulation scenarios, the leakage center frequency can be set according to the simulated values of the actual scenario.
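One way to combine the peak significance and frame stability criteria mentioned above is sketched below. It is only an illustration under stated assumptions: real-valued sample windows, significance measured as the strongest in-band bin relative to the in-band median level, and stability counted as the same bin winning in more than three frames. The narrow-band check is not implemented, and the function name and all parameter defaults are hypothetical.

```python
import numpy as np

def estimate_leakage_center(frames, fs, band, min_prominence_db=10.0, min_frames=4):
    """Return the estimated leakage center frequency in Hz, or None.

    frames: (n_frames, n_samples) real-valued observation windows (assumed).
    fs:     sampling frequency in Hz.
    band:   (f_low, f_high) target frequency band to scan, in Hz.
    """
    frames = np.asarray(frames, dtype=float)
    n = frames.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    window = np.hanning(n)
    votes = {}
    for x in frames:
        mag = np.abs(np.fft.rfft((x - x.mean()) * window))
        db = 20.0 * np.log10(mag[in_band] + 1e-12)
        k = int(np.argmax(db))
        # Peak significance: strongest bin versus the median in-band level
        if db[k] - np.median(db) >= min_prominence_db:
            f = float(freqs[in_band][k])
            votes[f] = votes.get(f, 0) + 1
    if not votes:
        return None
    f_best, count = max(votes.items(), key=lambda kv: kv[1])
    # Frequency stability: the same bin must win in at least min_frames frames
    return f_best if count >= min_frames else None
```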
[0069] To facilitate further processing by the subsequent feature representation model, it is necessary to ensure that the local spectra obtained from different sampling windows, different devices, and different times have a uniform dimension. Therefore, it is necessary to perform fixed-length interpolation resampling on the local spectrum according to the target frequency grid. Linear interpolation resampling is used in this case.
[0070] After obtaining the fixed-length local spectrum, logarithmic compression, dynamic range clipping, and normalization are performed to highlight the relative spectral structure of the local oscillator leakage within the local frequency band and to mitigate the adverse effects of absolute power fluctuations on device identification. These processes preserve the main peak and neighborhood spectral information while suppressing the adverse effects of excessively low spectral values on subsequent feature representation. Finally, the local spectral feature vector corresponding to this discrete observation sequence is obtained.
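The preprocessing chain of step 2 (DC removal, Hanning window, spectrum transformation, local band extraction, linear interpolation onto a fixed grid, logarithmic compression, dynamic range clipping, normalization) can be sketched as follows. The sketch assumes real-valued samples, a known leakage center frequency, log compression relative to the local peak, and scaling to the range [0, 1]; the function name and the parameters `half_width`, `n_points`, and `floor_db` are hypothetical.

```python
import numpy as np

def local_spectrum_features(x, fs, f_center, half_width, n_points=128, floor_db=-60.0):
    """Map one observation window to a fixed-length local spectral feature vector."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                              # DC removal
    x = x * np.hanning(x.size)                    # Hanning window
    mag = np.abs(np.fft.rfft(x))                  # amplitude spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)   # frequency of each spectral point

    # Target frequency grid: fixed number of points around the leakage center
    grid = np.linspace(f_center - half_width, f_center + half_width, n_points)
    local = np.interp(grid, freqs, mag)           # linear interpolation resampling

    # Log compression relative to the local peak removes absolute power level
    db = 20.0 * np.log10(local / (local.max() + 1e-12) + 1e-12)
    db = np.clip(db, floor_db, 0.0)               # dynamic range clipping
    return (db - floor_db) / (-floor_db)          # normalize to [0, 1]
```

Referencing the local peak before clipping is one way to reduce sensitivity to absolute power fluctuations while keeping the main peak and its neighborhood shape.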
[0071] After the above spectral analysis and preprocessing have been performed on all legitimate receiver observation samples in turn, the local spectral feature sample set of legitimate receivers $\mathcal{X}$ can be constructed:
[0072] $\mathcal{X} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$ (3)
[0073] where $\mathbf{x}_i$ is the local spectral feature vector of the local oscillator leakage corresponding to the i-th sample, $y_i$ is the corresponding legitimate receiver category identifier, and $N$ is the total number of samples.
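[0074] Based on this, the local spectral feature sample set of legitimate receivers $\mathcal{X}$, consisting of pairs $(\mathbf{x}_i, y_i)$, $i = 1, \dots, N$, of local spectral feature vector and category identifier, is divided into a training sample subset and a reference sample subset, which are represented as follows:
[0075] $\mathcal{X}_{\mathrm{tr}} = \{(\mathbf{x}_i, y_i) \mid i \in \mathcal{I}_{\mathrm{tr}}\}$ (4)
[0076] $\mathcal{X}_{\mathrm{ref}} = \{(\mathbf{x}_i, y_i) \mid i \in \mathcal{I}_{\mathrm{ref}}\}$ (5)
[0077] where $\mathcal{I}_{\mathrm{tr}}$ and $\mathcal{I}_{\mathrm{ref}}$ denote the training sample index set and the reference sample index set, respectively, and satisfy:
[0078] $\mathcal{I}_{\mathrm{tr}} \cap \mathcal{I}_{\mathrm{ref}} = \varnothing, \quad \mathcal{I}_{\mathrm{tr}} \cup \mathcal{I}_{\mathrm{ref}} = \{1, \dots, N\}$ (6)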
[0079] The training sample subset is used for training the subsequent feature representation model. The reference sample subset can be further divided into a prototype construction subset and a threshold calibration subset. The prototype construction subset is used to build the prototype feature library for legitimate receivers, while the threshold calibration subset is used to estimate the distribution of legitimacy discrimination scores and determine the legitimacy discrimination threshold. To ensure the stability of subsequent recognition and matching decisions, both the training sample subset and the reference sample subset cover the different categories of legitimate receivers as well as the different working conditions and environmental conditions.
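The three-way division described above (training, prototype construction, threshold calibration) can be sketched as a per-class random split. This sketch stratifies by receiver class only; covering the different working and environmental conditions in every subset would additionally require condition labels, which are not modeled here. The function name and the split fractions are hypothetical.

```python
import numpy as np

def split_indices(labels, train_frac=0.6, proto_frac=0.5, seed=0):
    """Split sample indices per class into three disjoint index arrays:
    training, prototype construction, and threshold calibration."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    train, proto, calib = [], [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        n_train = int(round(train_frac * idx.size))
        # The remaining reference samples are split again for the two roles
        n_proto = int(round(proto_frac * (idx.size - n_train)))
        train.extend(idx[:n_train])
        proto.extend(idx[n_train:n_train + n_proto])
        calib.extend(idx[n_train + n_proto:])
    return np.array(train), np.array(proto), np.array(calib)
```

Keeping the calibration samples separate from the prototype samples avoids estimating the threshold on scores that are optimistically high because the same samples shaped the prototypes.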
[0080] In step 3, the Transformer-based feature representation model is trained in a supervised manner on the training sample subset, yielding a feature representation model for characterizing the local oscillator leakage spectrum features of legitimate receivers. The details are as follows:
[0081] Local spectral features contain not only peak positions but also structural information such as main lobe width, shoulder distribution, and neighborhood spectral shape. Therefore, the local spectral feature vectors are first arranged into an input sequence according to frequency order. Next, the sequence units are mapped to a high-dimensional feature space, and embedding mapping is performed on each sequence unit. Through embedding mapping, the original local spectral sequence can be converted into a feature sequence adapted for Transformer encoding. To preserve the positional information of the local spectrum on the frequency axis, positional encoding is introduced into the embedding sequence. Simultaneously, a global representation marker vector is added before the input sequence to aggregate the global information of the entire local spectral sample. Then, the input sequence is fed into a feature representation network composed of several stacked Transformer encoders. Multi-head self-attention and feedforward neural networks can be used to model the correlation between different frequency positions of the local spectrum, extracting deep features representing the leakage spectral shape of the legitimate receiver's local oscillator. Finally, the output corresponding to the global representation marker in the final output sequence is extracted, and nonlinear projection and normalization are performed to obtain the embedding representation of the i-th training sample in the feature space.
[0082] $z_i = f_{\theta}(x_i)$ (7)
[0083] where $x_i$ denotes the local spectral feature vector of the local oscillator leakage corresponding to the i-th training sample, $z_i$ denotes its embedding representation in the feature space, $f_{\theta}(\cdot)$ denotes the feature representation model, and $\theta$ denotes the trainable parameters of the feature representation model.
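The data flow described above (frequency-ordered sequence, embedding, positional encoding, a prepended global marker, self-attention, read-out of the marker position, normalization) can be illustrated with a deliberately tiny sketch. It is not the model of this embodiment: the embedding rule, the sinusoidal positional encoding, the constant marker vector, and the single attention layer with identity projections are all assumptions chosen to keep the example dependency-free, and the function names are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    # Single-head scaled dot-product self-attention with identity Q/K/V
    # projections (a real encoder uses learned projections, several heads,
    # feed-forward layers, residual connections and layer normalization).
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

def encode_local_spectrum(x, d_model=4):
    tokens = []
    for pos, value in enumerate(x, start=1):
        token = []
        for j in range(d_model):
            # Assumed sinusoidal positional encoding over the frequency index.
            angle = pos / (10000 ** (2 * (j // 2) / d_model))
            pe = math.sin(angle) if j % 2 == 0 else math.cos(angle)
            token.append(value + pe)  # toy embedding: broadcast the bin value
        tokens.append(token)
    global_token = [1.0] * d_model    # stands in for a learned marker vector
    encoded = self_attention([global_token] + tokens)
    g = encoded[0]                    # output at the global marker position
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]      # L2-normalized embedding

z_example = encode_local_spectrum([0.1, 0.9, 0.2])
```

The point of the sketch is only the ordering of operations; in practice the embedding, marker vector, and encoder weights would be learned during training.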
[0084] To enable the model to learn distinguishable features among different legitimate receivers and improve feature consistency of the same legitimate receiver under different operating states and environmental conditions, a cosine similarity classification head is introduced during the training phase to calculate the class output score. At this point, the classification output of the i-th training sample for the c-th class can be expressed as:
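[0085] $o_i^{(c)} = s \cdot \dfrac{w_c^{\top} z_i}{\lVert w_c \rVert_2 \, \lVert z_i \rVert_2}$ (8)
[0086] where $w_c$ denotes the trainable class weight vector of the c-th class, $s$ denotes the scaling factor, $z_i$ denotes the embedding representation of the i-th training sample, $o_i^{(c)}$ denotes the classification score of the i-th training sample for the c-th class of legitimate receivers, $(\cdot)^{\top}$ denotes transpose, and $\lVert \cdot \rVert_2$ denotes the 2-norm of a vector.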
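As an illustration of the cosine similarity classification head, the following minimal sketch computes one scaled cosine score per class from an embedding and a list of class weight vectors. The function name and the numeric values are hypothetical; in training the class weights and embeddings would come from the model.

```python
import math

def cosine_logits(z, class_weights, scale):
    """Scaled cosine similarity between embedding z and each class weight vector."""
    z_norm = math.sqrt(sum(v * v for v in z))
    logits = []
    for w in class_weights:
        w_norm = math.sqrt(sum(v * v for v in w))
        dot = sum(a * b for a, b in zip(w, z))
        logits.append(scale * dot / (w_norm * z_norm))
    return logits

# Toy values: the score depends only on direction, not on vector length.
example_logits = cosine_logits([3.0, 4.0], [[1.0, 0.0], [0.0, 2.0]], scale=10.0)
```

Because both vectors are divided by their norms, rescaling a class weight vector or an embedding leaves the score unchanged, which is what makes the head insensitive to absolute amplitude.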
[0087] Based on the class labels of a subset of training samples, the cross-entropy loss is defined as:
[0088] $\mathcal{L}_{\mathrm{CE}} = -\dfrac{1}{N_{\mathrm{tr}}} \sum_{i=1}^{N_{\mathrm{tr}}} \log \dfrac{\exp\left(o_i^{(y_i)}\right)}{\sum_{k=1}^{C} \exp\left(o_i^{(k)}\right)}$ (9)
[0089] where $N_{\mathrm{tr}}$ denotes the number of training samples, $C$ is the total number of legitimate receiver classes, $k$ is the class index, $y_i$ denotes the category identifier of the legitimate receiver corresponding to the i-th training sample, $o_i^{(y_i)}$ denotes the classification score of the i-th training sample for its own class $y_i$, and $o_i^{(k)}$ denotes the classification score of the i-th training sample for the k-th class.
[0090] To further enhance the clustering of samples from the same legitimate receiver in the feature space and reduce intra-class discrepancies caused by different operating states and environmental conditions, an intra-class compactness constraint can be introduced. That is, a feature center is constructed for each class of legitimate receivers. The feature center of the c-th class samples in the training sample subset can be represented as:
[0091] $\mu_c = \dfrac{1}{N_c} \sum_{i:\, y_i = c} z_i$ (10)
[0092] where $N_c$ is the number of training samples of class c, $z_i$ is the embedding representation of the i-th training sample, $y_i$ is its class label, and $\mu_c$ denotes the center position of the c-th class samples in the feature space.
[0093] Based on these feature centers, an intra-class compactness loss can further be constructed to constrain samples of the same class to cluster towards their class center, which can be expressed as:
[0094] (11)
[0095] This loss achieves a contraction constraint on the distribution of features within a class by measuring the distance between sample features and their class centers.
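[0096] Next, the cross-entropy loss and the intra-class compactness loss are weighted and combined to obtain the overall training objective function:
[0097] $\mathcal{L} = \mathcal{L}_{\mathrm{CE}} + \lambda \, \mathcal{L}_{\mathrm{intra}}$ (12)
[0098] where $\mathcal{L}_{\mathrm{CE}}$ is the cross-entropy loss, $\mathcal{L}_{\mathrm{intra}}$ is the intra-class compactness loss, and $\lambda$ is a weighting coefficient used to balance classification discrimination ability against the intra-class compactness constraint.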
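The combined objective can be sketched as follows. This is an illustration under stated assumptions, not the training code of the embodiment: the intra-class term is taken here as the mean squared Euclidean distance between each embedding and its class center, the two terms are combined as cross-entropy plus a weighted compactness term, and the helper names and the value of `lam` are hypothetical.

```python
import math

def cross_entropy(logits, label):
    # Numerically stable -log softmax(logits)[label].
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[label]

def class_centers(embeddings, labels):
    sums, counts = {}, {}
    for z, y in zip(embeddings, labels):
        if y not in sums:
            sums[y] = [0.0] * len(z)
            counts[y] = 0
        sums[y] = [a + b for a, b in zip(sums[y], z)]
        counts[y] += 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def total_loss(logits_batch, embeddings, labels, lam=0.1):
    n = len(labels)
    ce = sum(cross_entropy(l, y) for l, y in zip(logits_batch, labels)) / n
    centers = class_centers(embeddings, labels)
    # Assumed form of the compactness term: mean squared distance to the class center.
    intra = sum(
        sum((a - b) ** 2 for a, b in zip(z, centers[y]))
        for z, y in zip(embeddings, labels)
    ) / n
    return ce + lam * intra, ce, intra
```

A larger `lam` pulls samples of one receiver closer together at the possible cost of class separation, which is the balance the weighting coefficient is meant to control.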
[0099] Subsequently, the overall training objective function is iteratively optimized to obtain the trained model parameters $\theta^{*}$. After model training is complete, the cosine similarity classification head used for supervised training is removed, and the embedding mapping, positional encoding, Transformer encoder, and projection-normalization modules are retained to obtain the final feature representation model:
[0100] $f_{\theta^{*}}(\cdot)$ (13)
[0101] This feature representation model can map the local oscillator leakage spectral features obtained in step 2 to a stable feature space, so that the samples collected by the same legitimate receiver under different operating states and environmental conditions have feature consistency, and the samples between different legitimate receivers have good separability, providing a basis for the construction of the prototype feature library of the subsequent reference sample subset and the matching decision of the test sample.
[0102] In step 4, the reference prototype samples used for prototype construction in the reference sample subset obtained in step 2 are input into the feature representation model trained in step 3 to obtain the corresponding reference feature vectors, and a prototype feature library is constructed according to the legitimate receiver categories. Specifically:
[0103] First, the local spectral feature vector of each reference prototype sample in the reference sample subset is input into the feature representation model obtained in step 3 to obtain the corresponding reference feature vector:
[0104] $r_i = f_{\theta^{*}}\left(x_i^{\mathrm{ref}}\right)$ (14)
[0105] where $x_i^{\mathrm{ref}}$ denotes the local spectral feature vector of the local oscillator leakage corresponding to the i-th reference sample, $r_i$ denotes the resulting reference feature vector, and $f_{\theta^{*}}(\cdot)$ denotes the trained feature representation model. Through this mapping, the local spectral features of the reference sample subset are uniformly mapped to a discriminative feature space consistent with the training phase in step 3.
[0106] For each class of legitimate receivers, all of its reference feature vectors are collected. To construct prototype features that characterize the distribution center of each legitimate receiver class, one or more prototype vectors are extracted from that class's reference feature set. The prototype feature of the c-th class of legitimate receivers can be constructed by within-class mean aggregation, and its prototype vector can be represented as:
[0107] $p_c = \dfrac{1}{M_c} \sum_{r_i \in \mathcal{R}_c} r_i$ (15)
[0108] where $M_c$ denotes the number of reference samples of the c-th class of legitimate receivers, $\mathcal{R}_c$ denotes the set of reference feature vectors corresponding to the c-th class of legitimate receivers, and $r_i$ denotes a reference feature vector. Aggregating the features of reference samples of the same class by their mean reduces the impact of instantaneous fluctuations of individual samples on the class representation, thereby improving the stability of subsequent matching decisions.
[0109] To mitigate the adverse effect of absolute amplitude differences among samples on subsequent similarity matching, the prototype vector of the c-th class, denoted $p_c$, is normalized:
[0110] $\tilde{p}_c = \dfrac{p_c}{\lVert p_c \rVert_2}$ (16)
[0111] where $\tilde{p}_c$ denotes the normalized prototype feature of the c-th class of legitimate receivers.
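A minimal sketch of within-class mean aggregation followed by normalization is given below. The function names, the use of a dictionary keyed by receiver label, and the toy vectors are assumptions made for illustration only.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def build_prototype_library(ref_features, ref_labels):
    """Group reference feature vectors by class, average them, then L2-normalize."""
    grouped = {}
    for r, label in zip(ref_features, ref_labels):
        grouped.setdefault(label, []).append(r)
    library = {}
    for label, vectors in grouped.items():
        mean = [sum(col) / len(vectors) for col in zip(*vectors)]
        library[label] = l2_normalize(mean)
    return library

# Hypothetical two-dimensional reference features for two registered receivers.
example_library = build_prototype_library(
    [[1.0, 0.0], [0.0, 1.0], [0.0, 2.0]],
    ["rx_a", "rx_a", "rx_b"],
)
```

After normalization every prototype has unit length, so a later inner product with a normalized test vector is directly a cosine similarity.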
[0112] However, the local oscillator leakage spectral characteristics of the same legitimate receiver may still exhibit some distribution drift under different operating states, temperature conditions, obstruction conditions, or receiving attitudes. Therefore, intra-class clustering can be performed on the reference feature set of each category to construct multiple sub-prototypes, enabling a finer characterization of the feature distribution of that category. For the reference feature vector set corresponding to the c-th class of legitimate receivers, the K-means clustering method can be used to divide it into $K_c$ sub-clusters according to the distance between feature vectors. Each reference feature vector belongs to exactly one sub-cluster, the sub-clusters do not overlap, and their union constitutes the set of all reference features of that class.
[0113] For the k-th sub-cluster of the c-th class of legitimate receivers, the corresponding sub-prototype vector is likewise constructed by mean aggregation, and can be represented as:
[0114] $p_{c,k} = \dfrac{1}{M_{c,k}} \sum_{r_i \in \mathcal{R}_{c,k}} r_i$ (17)
[0115] where $\mathcal{R}_{c,k}$ denotes the set of reference feature vectors in the k-th sub-cluster of the c-th class of legitimate receivers, and $M_{c,k}$ denotes the number of feature vectors in that sub-cluster. Extracting only one prototype per class can be regarded as the special case of a single sub-cluster per class, in which the sub-prototype coincides with the class prototype of equation (15).
[0116] After normalization, we get:
[0117] $\tilde{p}_{c,k} = \dfrac{p_{c,k}}{\lVert p_{c,k} \rVert_2}$ (18)
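The construction of sub-prototypes can be sketched with a small K-means routine. This is a minimal illustration, assuming Euclidean distance, a fixed number of iterations, and a deterministic initialization from the first k reference vectors; none of these choices is specified in the text, and the function names are hypothetical.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def kmeans(points, k, iters=20):
    # Assumed initialization: the first k points serve as initial centers.
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each vector joins its nearest center.
        assign = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign

def sub_prototypes(ref_vectors, k):
    """Cluster one class's reference vectors and return normalized sub-prototypes."""
    centers, assign = kmeans(ref_vectors, k)
    return [l2_normalize(c) for c in centers], assign

example_protos, example_assign = sub_prototypes(
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]], k=2
)
```

Each reference vector ends up in exactly one sub-cluster, so the sub-clusters partition the class's reference set as the text requires.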
[0118] This allows us to construct a prototype feature library from the prototype features corresponding to all legitimate receiver categories. This prototype feature library can then be used for similarity matching and legitimacy determination between the test sample and the legitimate receiver category.
[0119] In step 5, the same spectrum analysis and preprocessing procedure as in step 2 is performed on the radio frequency observation signal under test to extract the local spectral features corresponding to the local oscillator leakage, and the extraction results are input into the feature representation model trained in step 3. This allows us to obtain the feature vector of the sample to be tested. Specifically:
[0120] Let the radio frequency observation signal to be tested be $r_{\mathrm{test}}(t)$, which contains communication components, receiver-generated local oscillator leakage components, external interference, and noise components. To ensure that the test sample resides in the same feature space as the training and reference samples constructed in step 2, the same preprocessing parameters and procedures as in step 2 are used, including the same window function, monitoring bandwidth, local bandwidth, fixed frequency grid, dynamic range clipping parameters, and normalization method. The local spectral feature vector of the test sample, $x_{\mathrm{test}}$, is thereby obtained.
[0121] The local spectral feature vector to be tested is input into the feature representation model trained in step 3 to obtain the feature vector of the sample to be tested:
[0122] $z_{\mathrm{test}} = f_{\theta^{*}}\left(x_{\mathrm{test}}\right)$ (19)
[0123] Through the above processing, the radio frequency observation signal under test undergoes the same preprocessing, local spectrum extraction, and feature mapping process as in step 2, transforming it into a feature vector under test located in the same feature space as the prototype feature library in step 4. This also provides an input basis for subsequent similarity judgment and legality determination with the features of the legitimate receiver prototype.
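The shared preprocessing chain (fixed-length resampling, logarithmic compression, dynamic range clipping, normalization) can be sketched as below. The input is assumed to be the magnitude spectrum of the already extracted local band; clipping relative to the peak and the mapping to the interval [0, 1] are assumptions, since the exact normalization rule is not spelled out here, and the function names are hypothetical.

```python
import math

def resample_linear(values, n_out):
    """Linear interpolation of a sequence onto n_out equally spaced grid points."""
    n_in = len(values)
    if n_out == 1 or n_in == 1:
        return [values[0]] * n_out
    out = []
    for j in range(n_out):
        pos = j * (n_in - 1) / (n_out - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        out.append(values[lo] * (1.0 - frac) + values[hi] * frac)
    return out

def local_spectrum_features(magnitudes, n_out=129, dyn_range_db=60.0, eps=1e-12):
    fixed = resample_linear(magnitudes, n_out)            # fixed frequency grid
    db = [20.0 * math.log10(m + eps) for m in fixed]      # logarithmic compression
    floor_db = max(db) - dyn_range_db                     # assumed: clip relative to peak
    clipped = [max(v, floor_db) for v in db]              # dynamic range clipping
    return [(v - floor_db) / dyn_range_db for v in clipped]  # assumed [0, 1] scaling

example_features = local_spectrum_features([1e-6, 1.0, 1e-6], n_out=5)
```

Using the same `n_out`, `dyn_range_db`, and normalization for enrolment and test signals is what keeps the test feature vector comparable with the prototype library.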
[0124] In step 6, similarity matching is performed between the feature vector of the sample to be tested obtained in step 5 and the prototype features of each legitimate receiver in the prototype feature library constructed in step 4, to obtain the category matching score corresponding to each legitimate receiver category and, from these, the legitimacy discrimination score of the sample to be tested. Specifically:
[0125] Since the feature vector of the sample to be tested, $z_{\mathrm{test}}$, and the prototype features $\tilde{p}_{c,k}$ lie in the same feature space, cosine similarity can be used to measure the closeness between the feature vector of the test sample and each sub-prototype feature. Because both the feature vector of the test sample and the prototype features have been normalized, the similarity between the test sample and the k-th sub-prototype of the c-th class of legitimate receivers can be represented as:
[0126] $s_{c,k} = \tilde{p}_{c,k}^{\top} z_{\mathrm{test}}$ (20)
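[0127] Since one legitimate receiver category can correspond to one or more sub-prototypes, either the maximum similarity over the sub-prototypes of that category or the average of its Top-N similarities (N≥2) is taken as the matching score for that category. Writing $s_{c,k}$ for the similarity between the test sample and the k-th sub-prototype of the c-th class, and $\mathcal{T}_c^{N}$ for the index set of the N most similar sub-prototypes of that class, the category matching score of the c-th legitimate receiver category is defined as:
[0128] $S_c = \max_{k} s_{c,k}$ or $S_c = \dfrac{1}{N} \sum_{k \in \mathcal{T}_c^{N}} s_{c,k}$ (21)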
[0129] A higher category matching score indicates that the sample is closer to the distribution center of that category in the feature space. Based on the category matching scores $S_1, \dots, S_C$ of the $C$ legitimate receiver categories, the best-matching category $\hat{c}$ of the sample can be further determined:
[0130] $\hat{c} = \arg\max_{c \in \{1, \dots, C\}} S_c$ (22)
[0131] Let the maximum category matching score be:
[0132] $S_{\max} = \max_{c \in \{1, \dots, C\}} S_c$ (23)
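The matching described so far (similarity to every sub-prototype, one score per category, then the best-matching category) can be sketched as follows. The function name, the dictionary layout of the prototype library, and the toy vectors are assumptions; all vectors are taken to be L2-normalized so that the inner product equals the cosine similarity.

```python
def category_scores(z_test, library, top_n=None):
    """Per-category score: max similarity, or mean of the top_n similarities."""
    scores = {}
    for label, protos in library.items():
        sims = sorted(
            (sum(a * b for a, b in zip(p, z_test)) for p in protos),
            reverse=True,
        )
        if top_n is None:
            scores[label] = sims[0]
        else:
            n = min(top_n, len(sims))  # fall back if a class has fewer sub-prototypes
            scores[label] = sum(sims[:n]) / n
    return scores

# Hypothetical library: rx_a has two sub-prototypes, rx_b has one.
toy_library = {"rx_a": [[1.0, 0.0], [0.0, 1.0]], "rx_b": [[0.6, 0.8]]}
scores_max = category_scores([1.0, 0.0], toy_library)
scores_top2 = category_scores([1.0, 0.0], toy_library, top_n=2)
best_max = max(scores_max, key=scores_max.get)
best_top2 = max(scores_top2, key=scores_top2.get)
```

In this toy case the two aggregation rules disagree on the best-matching category, which shows why the choice between the maximum and the Top-N average is a design decision rather than a detail.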
[0133] To enhance the ability to discriminate unknown receivers in open scenarios, a legitimacy discrimination score can be constructed using the score difference between the best-matching and second-best-matching categories. With $\hat{c}$ denoting the best-matching category and $S_c$ the matching score of the c-th category, the second-largest category matching score is represented as:
[0134] $S_{\mathrm{sec}} = \max_{c \neq \hat{c}} S_c$ (24)
[0135] The validity score of the sample to be tested can be defined as:
[0136] (25)
[0137] in, For the weighting coefficients, satisfying .
[0138] When the sample to be tested has a high similarity to a certain legitimate receiver category and remains well separated from the other categories, the legitimacy discrimination score takes a larger value; conversely, when the sample has a low similarity to its best-matching category, is similar to several categories at once, or lacks similarity to all legitimate categories, the legitimacy discrimination score decreases.
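The behaviour described above can be reproduced with a simple margin-aware score. The combination used below, a weighted sum of the best score and the gap to the runner-up, is only one plausible form chosen for illustration; the actual weighting in equation (25) may differ, and the function name and the toy scores are hypothetical.

```python
def legitimacy_score(scores, beta=0.35):
    """Return (best category, score) from a dict of per-category matching scores.

    Assumed form: (1 - beta) * best + beta * (best - second_best).
    Requires at least two categories.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best_label, s_max), (_, s_sec) = ranked[0], ranked[1]
    return best_label, (1.0 - beta) * s_max + beta * (s_max - s_sec)

# A confident match, an ambiguous match, and a sample unlike every category.
confident = legitimacy_score({"rx_a": 0.9, "rx_b": 0.5, "rx_c": 0.1})
ambiguous = legitimacy_score({"rx_a": 0.9, "rx_b": 0.88, "rx_c": 0.1})
dissimilar = legitimacy_score({"rx_a": 0.2, "rx_b": 0.1, "rx_c": 0.05})
```

Under this assumed form, both the ambiguous and the dissimilar sample score lower than the confident one, matching the qualitative description.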
[0139] In step 7, eavesdropping detection is performed on the sample under test based on the legitimacy discrimination score obtained in step 6. When the legitimacy discrimination score of the sample under test does not meet the discrimination threshold condition, the sample under test is identified as a potential illegal receiver and a detection alarm is output.
[0140] Specifically, in this embodiment, the legitimacy judgment condition is set as follows:
[0141] $\Gamma \ge \tau$ (26)
[0142] where $\Gamma$ denotes the legitimacy discrimination score of the sample under test and $\tau$ denotes the legitimacy discrimination threshold. If the sample under test satisfies the condition in equation (26), it is determined to belong to the best-matching legitimate receiver category; otherwise, it is determined not to belong to any registered legitimate receiver category and is marked as a potential illegal receiver.
[0143] In practical applications, the discrimination threshold in equation (26) can be determined from the distribution of the legitimacy discrimination scores of the reference calibration samples of legitimate receivers. The set of legitimacy discrimination scores of the reference calibration samples of registered legitimate receivers is calculated, and the corresponding quantile is selected as the threshold according to the preset target acceptance rate of legitimate samples. If the preset target acceptance rate of legitimate samples is η, the (1−η) quantile of the set of legitimacy discrimination scores of the reference calibration samples is taken as the legitimacy discrimination threshold.
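Threshold calibration from the reference calibration scores can be sketched as follows. The order-statistic rule used to pick the quantile is an assumption (other interpolation rules give slightly different thresholds), the acceptance test uses "greater than or equal", and the function names and the synthetic scores are hypothetical.

```python
import math

def quantile_threshold(calibration_scores, target_accept=0.95):
    """Pick the (1 - target_accept) quantile of the calibration scores."""
    ordered = sorted(calibration_scores)
    q = 1.0 - target_accept
    # Small epsilon guards against floating-point error in q * len(ordered).
    idx = int(math.floor(q * len(ordered) + 1e-9))
    idx = min(max(idx, 0), len(ordered) - 1)
    return ordered[idx]

def is_legitimate(score, threshold):
    return score >= threshold

# Synthetic calibration scores 0.01, 0.02, ..., 1.00 (illustrative only).
calib = [i / 100 for i in range(1, 101)]
tau = quantile_threshold(calib, target_accept=0.95)
accepted = sum(is_legitimate(s, tau) for s in calib)
```

With these synthetic scores, 95 of the 100 calibration samples pass the threshold, which is the target acceptance rate η = 0.95 expressed as a count.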
[0144] Through the above judgment process, the system can identify whether the sample under test belongs to a certain registered legitimate receiver category, and can reject unknown receivers that are not on the whitelist in open scenarios, thereby realizing the identification and alarm of potential illegal receivers based on the local oscillator leakage local spectrum characteristics.
[0145] To enable those skilled in the art to better understand the present invention, the performance of the local oscillator leakage illegal receiver detection method based on Transformer feature representation and prototype matching in this embodiment under a specific system configuration is shown below.
[0146] The simulation data was generated on the MATLAB platform, modeling a receiver local oscillator leakage scenario in the 2.4 GHz target communication band. After spectral analysis of the observed signal, a 0.8 MHz local frequency band was extracted around the center frequency of the local oscillator leakage. This band was then processed through linear interpolation resampling, logarithmic compression, 60 dB dynamic range clipping, and normalization to obtain a 129-dimensional local spectral feature vector. In this process, the communication component served as a broadband background component in the scenario modeling rather than as the primary basis for device identification; the model mainly uses the local oscillator leakage spectral peak and its neighborhood spectral structure for legitimacy identification. The dataset included 30 classes of registered legitimate receivers and 20 classes of unknown receivers. Legitimate receiver samples were used for model training, prototype library construction, and legitimacy testing, while unknown receiver samples were used for open-set rejection testing. The feature representation model adopted a structure combining two convolution layers with Transformer encoding. First, the input vector was mapped to a high-dimensional feature space through two layers of one-dimensional convolution, with a kernel size of 5 and channel numbers of 96 and 192, respectively, using the GELU activation function. The Transformer encoder has 4 layers, a model feature dimension of 192, and 8 attention heads, and ultimately outputs 192-dimensional normalized embedding features. During training, a cosine classification head is used for supervised learning, and the AdamW optimizer is employed with a learning rate of 2×10⁻⁴ and a batch size of 128.
During the prototype matching phase, three sub-prototypes are constructed for each class of legitimate receivers, and the average of the two highest similarity scores within the same class is taken as the class matching score. The weighting coefficient is set to 0.35, and the legitimacy threshold is determined from the reference calibration samples with the target legitimate-sample acceptance rate set to 0.95.
[0147] Figure 2 shows the ROC (Receiver Operating Characteristic) curve of the method of this invention on the test set. As can be seen from Figure 2, the AUROC (area under the ROC curve) of the method reaches 0.9443, indicating that the method can effectively distinguish between registered legitimate receiver samples and unknown receiver samples outside the whitelist.
[0148] Figure 3 presents the key performance indicators, where AUPR-U represents the area under the precision-recall curve for the unknown class; AUPR-K represents the area under the precision-recall curve for the legitimate class; TAR is the proportion of legitimate receivers correctly accepted; and FPR is the proportion of unknown receivers mistakenly accepted as legitimate receivers. In this embodiment, AUPR-U is 0.842, AUPR-K is 0.977, the legitimate sample acceptance rate is 0.912, the illegal sample rejection rate is 0.824, and TAR@1%FPR, TAR@5%FPR, and TAR@10%FPR are 0.688, 0.848, and 0.887, respectively. These results demonstrate that, under open-set testing conditions, this invention can not only identify legitimate receiver categories within the whitelist but also effectively reject unknown receiving devices outside the whitelist.
[0149] Figure 4 shows the distribution of legitimacy discrimination scores for legitimate and illegitimate receiver samples, together with the discrimination threshold. It can be seen that the discrimination scores of legitimate receiver samples are generally higher than those of unknown receiver samples, and the two score distributions are clearly separated. The legitimacy discrimination threshold is 0.509. When the discrimination score of a sample exceeds this threshold, the system classifies it as a legitimate receiver; otherwise, it is classified as a potentially illegitimate receiver. As the discrimination threshold increases, the false acceptance rate of illegitimate receivers decreases, but the false rejection rate of legitimate receivers increases accordingly.
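The threshold trade-off can be made concrete with a small sketch that evaluates the acceptance rate of legitimate samples (TAR) and the false acceptance rate of unknown samples (FPR) at a given threshold. The scores below are made-up illustrative values, not results from this embodiment, and the function name is hypothetical.

```python
def rates_at_threshold(legit_scores, unknown_scores, threshold):
    """Return (TAR, FPR): fractions of legitimate / unknown samples accepted."""
    tar = sum(s >= threshold for s in legit_scores) / len(legit_scores)
    fpr = sum(s >= threshold for s in unknown_scores) / len(unknown_scores)
    return tar, fpr

# Made-up scores for illustration only.
legit = [0.9, 0.8, 0.7, 0.6]
unknown = [0.65, 0.4, 0.3, 0.2]
low_thr = rates_at_threshold(legit, unknown, 0.5)    # permissive threshold
high_thr = rates_at_threshold(legit, unknown, 0.75)  # strict threshold
```

Raising the threshold from 0.5 to 0.75 removes the single false acceptance in this toy data but also rejects half of the legitimate samples, which is the same trade-off described for the discrimination threshold.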
[0150] This invention also discloses a computer system, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the computer program is executed by the processor, it implements the steps of the eavesdropping detection method based on Transformer feature representation and communication signal enhancement.
[0151] The program code used to implement the method of the present invention can be written in any combination of one or more programming languages. This program code can be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing device, such that when executed by the processor or controller, the program code causes the steps of the method of the present invention to be performed. The program code can be executed entirely on the machine, partly on the machine, as a standalone software package partly on the machine and partly on a remote machine, or entirely on a remote machine or server. All aspects not detailed in this invention are well known to those skilled in the art.
Claims
1. A method for eavesdropping detection based on Transformer feature representation and communication signal enhancement, characterized in that it includes the following steps: Step 1: when the legitimate communication link is in normal working condition, collecting radio frequency observation signals containing local oscillator leakage components from registered legitimate receivers under different operating states and environmental conditions to form a legitimate receiver observation sample set; Step 2: performing spectrum analysis and preprocessing on the legitimate receiver observation sample set, extracting the local spectral features corresponding to the local oscillator leakage, and constructing a training sample subset and a reference sample subset; Step 3: training the Transformer-based feature representation model on the training sample subset, the feature representation model being used to extract the feature vector of the local oscillator leakage spectrum of a receiver; Step 4: inputting the reference prototype samples used for prototype construction from the reference sample subset into the trained feature representation model to obtain the corresponding feature vectors, and constructing the prototype feature library according to the legitimate receiver categories; Step 5: performing the same preprocessing and local feature extraction as in Step 2 on the radio frequency observation signal to be tested to obtain the sample to be tested, and inputting it into the feature representation model to obtain the feature vector of the sample to be tested; Step 6: performing similarity matching between the feature vector of the sample to be tested and the prototype features of each legitimate receiver in the prototype feature library to obtain the matching score of each category, as well as the best-matching category and the second-best-matching category, and constructing the legitimacy discrimination score of the sample to be tested using the score difference between the best-matching category and the second-best-matching category; Step 7: performing eavesdropping detection on the sample to be tested based on the matching results, wherein when the legitimacy discrimination score of the sample to be tested does not meet the discrimination threshold condition, the sample to be tested is judged to be a potential illegal receiver and a detection alarm is output.
2. The eavesdropping detection method based on Transformer feature representation and communication signal enhancement according to claim 1, characterized in that, in step 1, by changing the operating state and environmental conditions between the receiver and the transmitter, multiple observation windows are collected to form observation samples corresponding to each legitimate receiver category; changes in the operating state include changes in terminal orientation and temperature, while changes in environmental conditions include changes in relative distance and obstruction; the collected radio frequency observation signal is represented as: $r(t) = s(t) + l(t) + n(t)$; where $t$ denotes time, $s(t)$ denotes the communication signal component, $l(t)$ denotes the local oscillator leakage component, and $n(t)$ denotes the disturbance term; the legitimate receiver observation sample set is represented as: $\mathcal{D} = \{(r_i, y_i)\}_{i=1}^{N}$; where $r_i$ denotes the discrete observation sequence corresponding to the i-th sampling window, containing a fixed number of sampling points and obtained by discretizing the continuous-time observation signal according to a preset sampling period, $y_i$ denotes the legitimate receiver category identifier corresponding to the i-th sampling window, and $N$ denotes the total number of whitelist observation samples.
3. The eavesdropping detection method based on Transformer feature representation and communication signal enhancement according to claim 1, characterized in that, in step 2, performing spectrum analysis and preprocessing on the legitimate receiver observation sample set to extract the local spectral features corresponding to the local oscillator leakage specifically includes: for each discrete observation sequence in the legitimate receiver observation sample set, after DC removal and windowing, performing a spectrum transformation to obtain the corresponding frequency-domain representation and amplitude spectrum, and determining the frequency position corresponding to each spectrum point in combination with the sampling frequency; determining the leakage center frequency, extracting a local frequency band of preset width around the leakage center frequency, and performing fixed-length interpolation resampling on the local spectrum according to the target frequency grid to obtain a fixed-length local spectrum of unified dimension; and performing logarithmic compression, dynamic range clipping, and normalization on the fixed-length local spectrum to obtain the local spectral feature vector corresponding to the discrete observation sequence.
4. The eavesdropping detection method based on Transformer feature representation and communication signal enhancement according to claim 3, characterized in that, in step 2, constructing the training sample subset and the reference sample subset specifically includes: constructing a local spectral feature sample set of legitimate receivers: $\mathcal{X} = \{(x_i, y_i)\}_{i=1}^{N}$; where $x_i$ denotes the local spectral feature vector of the local oscillator leakage corresponding to the i-th sample, $y_i$ denotes the legitimate receiver category identifier corresponding to the i-th sample, and $N$ denotes the total number of whitelist observation samples; dividing the local spectral feature sample set $\mathcal{X}$ of legitimate receivers into a training sample subset $\mathcal{D}_{\mathrm{tr}} = \{(x_i, y_i) \mid i \in \mathcal{I}_{\mathrm{tr}}\}$ and a reference sample subset $\mathcal{D}_{\mathrm{ref}} = \{(x_i, y_i) \mid i \in \mathcal{I}_{\mathrm{ref}}\}$; where $\mathcal{I}_{\mathrm{tr}}$ and $\mathcal{I}_{\mathrm{ref}}$ denote the training sample index set and the reference sample index set, respectively, and satisfy: ; both the training sample subset and the reference sample subset cover the different legitimate receiver categories as well as different operating states and environmental conditions.
5. The eavesdropping detection method based on Transformer feature representation and communication signal enhancement according to claim 1, characterized in that, in step 3, training the Transformer-based feature representation model on the training sample subset specifically includes: arranging the local spectral feature vectors of the training samples into an input sequence in frequency order; after embedding mapping and positional encoding, and after adding a global representation marker vector at the beginning of the sequence, feeding the sequence into a feature representation network composed of multiple stacked Transformer encoders; extracting the output corresponding to the global representation marker and performing nonlinear projection and normalization to obtain the embedding representation of the training sample in the feature space; wherein the embedding representation of the i-th training sample in the feature space is: $z_i = f_{\theta}(x_i),\ i \in \mathcal{I}_{\mathrm{tr}}$; where $x_i$ denotes the local spectral feature vector of the local oscillator leakage corresponding to the i-th training sample, $\mathcal{I}_{\mathrm{tr}}$ denotes the set of training sample indices, $f_{\theta}(\cdot)$ denotes the feature representation model, and $\theta$ denotes the trainable parameters of the feature representation model; a cosine similarity classification head is introduced during the training phase to calculate the class output score, wherein the classification output of the i-th training sample for the c-th class is represented as: $o_i^{(c)} = s \cdot \dfrac{w_c^{\top} z_i}{\lVert w_c \rVert_2 \, \lVert z_i \rVert_2}$; where $w_c$ denotes the trainable class weight vector of the c-th class, $s$ denotes the scaling factor, $o_i^{(c)}$ denotes the classification score of the i-th training sample for the c-th class of legitimate receivers, $(\cdot)^{\top}$ denotes transpose, and $\lVert \cdot \rVert_2$ denotes the 2-norm of a vector; based on the class labels of the training sample subset, the cross-entropy loss is defined as: $\mathcal{L}_{\mathrm{CE}} = -\dfrac{1}{N_{\mathrm{tr}}} \sum_{i=1}^{N_{\mathrm{tr}}} \log \dfrac{\exp\left(o_i^{(y_i)}\right)}{\sum_{k=1}^{C} \exp\left(o_i^{(k)}\right)}$; where $N_{\mathrm{tr}}$ denotes the number of training samples, $C$ is the total number of legitimate receiver classes, $k$ is the class index, $y_i$ denotes the category identifier of the legitimate receiver corresponding to the i-th training sample, $o_i^{(y_i)}$ denotes the classification score of the i-th training sample for class $y_i$, and $o_i^{(k)}$ denotes the classification score of the i-th training sample for the k-th class; for each class of legitimate receivers, a feature center is constructed, and the feature center of the c-th class samples in the training sample subset is represented as: $\mu_c = \dfrac{1}{N_c} \sum_{i:\, y_i = c} z_i$; where $N_c$ is the number of training samples of class c, and $\mu_c$ denotes the center position of the c-th class samples in the feature space; an intra-class compactness loss $\mathcal{L}_{\mathrm{intra}}$ is constructed to constrain samples of the same class to cluster towards their class center, expressed as: ; the cross-entropy loss and the intra-class compactness loss are weighted and combined to obtain the overall training objective function: $\mathcal{L} = \mathcal{L}_{\mathrm{CE}} + \lambda \, \mathcal{L}_{\mathrm{intra}}$; where $\lambda$ is the weighting coefficient; the overall training objective function is iteratively optimized to obtain the trained model parameters $\theta^{*}$; after model training is complete, the cosine similarity classification head used for supervised training is removed to obtain the final feature representation model $f_{\theta^{*}}(\cdot)$.
6. The eavesdropping detection method based on Transformer feature representation and communication signal enhancement according to claim 1, characterized in that, in step 4, constructing the prototype feature library according to the legitimate receiver categories specifically includes: inputting the local spectral feature vector of each reference prototype sample in the reference sample subset into the feature representation model obtained in step 3 to obtain the corresponding reference feature vector; for each class of legitimate receivers, collecting all of its reference feature vectors; constructing the prototype feature of the c-th class of legitimate receivers by intra-class mean aggregation, the prototype feature vector being represented as: $p_c = \dfrac{1}{M_c} \sum_{r_i \in \mathcal{R}_c} r_i$; where $r_i$ denotes the reference feature vector corresponding to the i-th reference sample, $M_c$ denotes the number of reference samples of the c-th class of legitimate receivers, and $\mathcal{R}_c$ denotes the set of reference feature vectors corresponding to the c-th class of legitimate receivers; normalizing the prototype feature vector $p_c$: $\tilde{p}_c = \dfrac{p_c}{\lVert p_c \rVert_2}$; where $\tilde{p}_c$ denotes the normalized prototype feature of the c-th class of legitimate receivers.
7. The eavesdropping detection method based on Transformer feature representation and communication signal enhancement according to claim 6, characterized in that the set of reference feature vectors corresponding to the c-th class of legitimate receivers is divided into $K_c$ sub-clusters, each reference feature vector belonging to only one sub-cluster, the sub-clusters not overlapping, and the union of the sub-clusters constituting the set of all reference features of the c-th class; for the k-th sub-cluster of the c-th class of legitimate receivers, the corresponding sub-prototype feature vector is constructed by intra-class mean aggregation, denoted as: $p_{c,k} = \dfrac{1}{M_{c,k}} \sum_{r_i \in \mathcal{R}_{c,k}} r_i$; where $\mathcal{R}_{c,k}$ denotes the set of reference feature vectors corresponding to the k-th sub-cluster of the c-th class of legitimate receivers, and $M_{c,k}$ denotes the number of feature vectors within the k-th sub-cluster of the c-th class of legitimate receivers; normalizing $p_{c,k}$ gives: $\tilde{p}_{c,k} = \dfrac{p_{c,k}}{\lVert p_{c,k} \rVert_2}$.
8. The eavesdropping detection method based on Transformer feature representation and communication signal enhancement according to claim 1, characterized in that, in step 6, cosine similarity is used to calculate the closeness between the feature vector of the sample to be tested and the prototype features of each legitimate receiver; when multiple sub-prototypes correspond to the same legitimate receiver category, the maximum of the similarity results of the sub-prototypes within that category, or the average of the Top-N similarities, is taken as the matching score for that category, where N≥2; based on the category matching score vector, the best-matching category of the sample to be tested is determined: $\hat{c} = \arg\max_{c \in \{1, \dots, C\}} S_c$; where $S_c$ denotes the matching score of the c-th class of legitimate receivers, and $C$ denotes the total number of legitimate receiver classes; a legitimacy discrimination score is constructed using the score difference between the best-matching and second-best-matching categories, the second-largest category matching score being represented as: $S_{\mathrm{sec}} = \max_{c \neq \hat{c}} S_c$; the legitimacy discrimination score of the sample to be tested is defined as: ; where $S_{\max}$ is the highest category matching score, and the weighting coefficient satisfies .
9. The eavesdropping detection method based on Transformer feature representation and communication signal enhancement according to claim 1, characterized in that the legitimacy judgment condition in step 7 is: $\Gamma \ge \tau$; where $\Gamma$ denotes the legitimacy discrimination score of the sample to be tested, and $\tau$ denotes the legitimacy discrimination threshold, which is determined based on the legitimacy discrimination score distribution of the reference calibration samples used for threshold calibration in the reference sample subset; if the sample to be tested meets the legitimacy judgment condition, it is determined that the sample to be tested belongs to the best-matching legitimate receiver category; otherwise, it is determined that it does not belong to any registered legitimate receiver category and it is marked as a potential illegal receiver.
10. A computer system comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the computer program is executed by the processor, it implements the steps of the eavesdropping detection method based on Transformer feature representation and communication signal enhancement according to any one of claims 1-9.