A data augmentation method for radio frequency fingerprinting

By using a random slicing method based on Euclidean distance and phase angle, along with random scaling and phase rotation, combined with Gaussian white noise, diverse data samples are generated. This solves the problems of recognition accuracy and security in complex environments for radio frequency fingerprint recognition technology, and improves the robustness and generalization ability of the model.

CN116546504B · Active · Publication Date: 2026-06-12 · UNIT 63892 OF PLA

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: UNIT 63892 OF PLA
Filing Date: 2023-03-31
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

Traditional radio frequency fingerprint recognition technology suffers from performance degradation and overfitting issues in unknown environments when faced with the complex electromagnetic environment of wireless networks and insufficient training data, resulting in insufficient recognition accuracy and security.

Method used

A random slicing method based on Euclidean distance and phase angle discrimination is adopted, combined with random scaling and phase rotation, and Gaussian white noise is superimposed to generate diverse data augmentation samples, thereby enhancing the robustness of the RF fingerprint recognition model.

Benefits of technology

It improves the recognition accuracy and security of the radio frequency fingerprint recognition model in complex environments, alleviates the overfitting problem, and enhances the model's generalization ability.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a data augmentation method for radio frequency fingerprint identification, comprising the following steps: reading and preprocessing a segment of I/Q data collected in advance, keeping it unchanged if it is already complex baseband data and otherwise converting it into complex baseband I/Q data; normalizing the complex baseband data by the maximum value of the complex envelope to obtain normalized complex baseband data; randomly slicing the normalized complex baseband data to generate a slice data matrix, then cutting each slice of the matrix into sub-slices; randomly scaling and phase-rotating the slice data matrix to obtain a slice matrix; superimposing Gaussian white noise at different signal-to-noise ratios on the slice matrix to obtain a slice matrix that has undergone random cutting, scaling, rotation, and Gaussian white noise; and repeating the above steps according to different simulation requirements to continuously generate augmented data samples. The application can improve the robustness of the radio frequency fingerprint and increase the diversity of radio signal samples.

Description

Technical Field

[0001] This invention belongs to the field of radio frequency fingerprint recognition technology, and in particular relates to a data augmentation method for radio frequency fingerprint recognition.

Background Technology

[0002] In recent years, the rapid development of wireless communication technology, especially the widespread application of 5G, has driven the rapid growth in the number of IoT devices. Currently, the massive number of IoT devices connected to wireless networks presents unprecedented security challenges, driving the development of intelligent, reliable, and legitimate device authentication systems. Traditional authentication systems based on communication protocols and key authentication suffer from problems such as easy key theft and duplication, failing to meet the security requirements of large-scale IoT systems. Simultaneously, the expanding number of frequency-using devices has led to an increasingly complex electromagnetic environment, making the maintenance of radio order and security, and the prevention of unauthorized users, increasingly important for promoting the modernization of the national radio governance system and capabilities. In contrast, radio frequency fingerprinting technology, based on the physical layer characteristics of signals, can identify individual communication radiation sources without relying on keys, offering advantages such as high accuracy and strong security. Furthermore, since the radio frequency fingerprinting algorithm is deployed on the server side, no additional modifications are needed to the client's transmitter, making it highly promising for low-power, low-cost applications such as the IoT.

[0003] Transmitter performance drift, changes in the wireless channel environment, and variations in environmental conditions can alter the distribution of training and test data, potentially causing severe performance degradation of radio frequency fingerprint identification models in unknown environments. Under non-cooperative conditions, insufficient training data also increases the risk of overfitting. Data augmentation methods can mitigate the scarcity of training samples to some extent, and by introducing data diversity they can alleviate overfitting, reduce sensitivity to confounding factors such as wireless channels and frequency drift, and improve the model's generalization ability.

Summary of the Invention

[0004] To address the aforementioned problems, the present invention aims to provide a data augmentation method for radio frequency fingerprint recognition, which can improve the robustness of radio frequency fingerprints, enhance the diversity of radio signal samples, and simplify the data augmentation process.

[0005] To achieve the above-mentioned objectives, the present invention adopts the following technical solution:

[0006] A data augmentation method for radio frequency fingerprint recognition includes the following steps:

[0007] S1. Read a pre-acquired segment of I/Q data and preprocess it. If the I/Q data is complex baseband data, it remains unchanged; otherwise, it is converted to complex baseband I/Q data. The complex baseband I/Q data is represented as $Y = I + jQ$, where I represents the real part of the signal, Q represents the imaginary part of the signal, and M represents the number of sample points;

[0008] S2. Normalize the complex baseband data Y from step S1 using the maximum value of the complex envelope to obtain the normalized complex baseband data $Y_n$, namely $Y_n = Y / \max(|Y|)$;

[0009] S3. Randomly slice the normalized complex baseband data $Y_n$ from step S2 to generate a slice data matrix of M slices, each of length N. Then cut each slice of the slice data matrix into O sub-slices, selecting the connection points between adjacent sub-slices by Euclidean distance and phase difference discrimination to maintain continuity between the sub-slices;

[0010] S4. Apply random scaling and phase rotation to the slice data matrix from step S3, row by row and in blocks of the confounding factor duration T, to obtain the scaled and rotated slice matrix;

[0011] S5. Superimpose Gaussian white noise with different signal-to-noise ratios (SNR) on the slice matrix from step S4 to obtain a slice matrix that has undergone random cropping, scaling, rotation, and Gaussian white noise processing;

[0012] S6. According to different simulation requirements, set the number of slices M, slice length N, and number of sub-slices O. By adjusting the scaling factor ρ, rotation factor φ, duration of confounding factors T, and signal-to-noise ratio SNR, repeat steps S2 to S5 to continuously generate diverse and non-repetitive data augmentation samples.
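
Steps S1 and S2 above can be sketched in a few lines of Python. This is a minimal illustration, assuming the I and Q samples are already at baseband and arrive as two equal-length lists; the function names are hypothetical and the conversion of non-baseband data is not covered.

```python
def to_complex_baseband(i_samples, q_samples):
    """Combine I and Q sample lists into complex samples Y[k] = I[k] + jQ[k]."""
    if len(i_samples) != len(q_samples):
        raise ValueError("I and Q must have the same length")
    return [complex(i, q) for i, q in zip(i_samples, q_samples)]


def normalize_by_max_envelope(y):
    """Divide every sample by the largest envelope magnitude max|Y[k]|."""
    peak = max(abs(sample) for sample in y)
    if peak == 0:
        raise ValueError("signal is identically zero")
    return [sample / peak for sample in y]


# Toy input (made-up values): three I/Q pairs.
y = to_complex_baseband([3.0, 0.0, -1.0], [4.0, 2.0, 1.0])
y_n = normalize_by_max_envelope(y)
print(max(abs(sample) for sample in y_n))
```

After normalization the largest envelope magnitude is 1 (up to floating-point rounding), so slices taken from recordings with different gains start from a comparable amplitude scale.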

[0013] Furthermore, in step S3 above, cutting the slice data matrix into O sub-slices includes the following sub-steps:

[0014] S3.1. Consider the starting point of the j-th sub-slice of the i-th data slice, where i runs over the M data slices. Let the sub-slice length be $M_s$; each data slice consists of O sub-slices, and the slice length, sub-slice length, and number of sub-slices satisfy $N = M_s \times O$, all being positive integers. The starting point of the first sub-slice of each data slice is generated randomly, giving M starting points in total, one per data slice;

[0015] S3.2. In the complex baseband data $Y_n$, the sample at the first starting point and the $M_s - 1$ samples after it are assigned to the first sub-slice, Slice1Sub1; the last of these samples is the end point of the first sub-slice of the first data slice;

[0016] S3.3. Starting from the end point of the first sub-slice of the first data slice, skip forward a fixed length of l samples, then search for a sample point that satisfies the Euclidean distance and phase difference conditions; this point serves as the starting point of the second sub-slice, Slice1Sub2;

[0017] S3.4 Repeat step S3.3 to generate the remaining O-2 sub-slices in sequence. Then, connect all O sub-slices end to end and splice them together to generate the first data slice.

[0018] S3.5. Using the remaining M-1 randomly generated slice starting points, repeat steps S3.1 to S3.4 to generate the remaining M-1 data slices; then stack all data slices row by row to form the slice data matrix.
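
Sub-steps S3.1 to S3.5 can be sketched as follows. This is a rough illustration rather than the patented procedure: the splice search here uses only a Euclidean distance test as a stand-in for the full two-condition rule, start points that run out of data are simply redrawn, and all function names, the retry limit, and the toy input signal are assumptions.

```python
import cmath
import math
import random


def nearest_start(y, end, skip, threshold):
    """Placeholder splice search (distance test only, an assumption).

    Returns the first index after end + skip whose sample lies within
    `threshold` of y[end], or None if no such index exists.
    """
    for k in range(end + skip + 1, len(y)):
        if abs(y[k] - y[end]) < threshold:
            return k
    return None


def build_slice(y, start, sub_len, num_sub, skip, threshold):
    """Splice num_sub sub-slices of sub_len samples into one data slice."""
    pieces = []
    for j in range(num_sub):
        if start is None or start + sub_len > len(y):
            return None  # not enough data after this start point
        pieces.extend(y[start:start + sub_len])
        if j < num_sub - 1:
            start = nearest_start(y, start + sub_len - 1, skip, threshold)
    return pieces


def build_slice_matrix(y, num_slices, sub_len, num_sub, skip, threshold, rng,
                       max_tries=1000):
    """Stack num_slices data slices, each grown from a random first start."""
    rows = []
    for _ in range(max_tries):
        if len(rows) == num_slices:
            break
        start = rng.randrange(len(y) - sub_len + 1)
        row = build_slice(y, start, sub_len, num_sub, skip, threshold)
        if row is not None:
            rows.append(row)
    if len(rows) < num_slices:
        raise RuntimeError("could not build enough slices from this signal")
    return rows


# Toy input: a unit-magnitude tone with a period of 50 samples.
tone = [cmath.exp(2j * math.pi * k / 50) for k in range(2000)]
matrix = build_slice_matrix(tone, num_slices=5, sub_len=100, num_sub=4,
                            skip=100, threshold=0.05, rng=random.Random(0))
print(len(matrix), len(matrix[0]))
```

Each row has length `sub_len * num_sub`, which mirrors the relationship between slice length, sub-slice length, and number of sub-slices stated in S3.1.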

[0019] Furthermore, in step S3 above, selecting connection points between adjacent sub-slices using Euclidean distance and phase difference includes the following sub-steps:

[0020] S3a. Take the end point of the j-th sub-slice of the i-th data slice. After skipping forward a fixed length of l sample points, search for a sample point that satisfies the following two conditions to serve as the starting point of the (j+1)-th sub-slice:

[0021] (1) The Euclidean distance between the candidate starting point and the end point is less than the threshold;

[0022] (2) The trend of the phase angle change at the candidate starting point is consistent with the trend of the phase angle change at the end point, that is:

[0023] .
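
The two splice conditions can be sketched as a predicate. The exact formulas are not reproduced in this text, so the version below is one plausible reading and should be treated as an assumption: condition (1) compares the complex samples directly, and condition (2) requires the phase step into the end point and the phase step out of the candidate point to have the same sign. The names `phase_step`, `is_valid_join`, and `find_join` are hypothetical.

```python
import cmath
import math


def phase_step(a, b):
    """Phase change from sample a to sample b, wrapped to [-pi, pi]."""
    return cmath.phase(b * a.conjugate())


def is_valid_join(y, end, cand, threshold):
    """Check both splice conditions for a candidate start index."""
    # (1) Candidate sample is close to the end-point sample.
    close = abs(y[cand] - y[end]) < threshold
    # (2) Assumed reading: phase keeps turning in the same direction.
    step_in = phase_step(y[end - 1], y[end])
    step_out = phase_step(y[cand], y[cand + 1])
    return close and step_in * step_out > 0


def find_join(y, end, skip, threshold):
    """First index after end + skip that satisfies both conditions."""
    if end < 1:
        raise ValueError("end must have a preceding sample")
    for cand in range(end + skip + 1, len(y) - 1):
        if is_valid_join(y, end, cand, threshold):
            return cand
    return None


# Toy signals: a counterclockwise tone, and one that reverses direction.
theta = 2 * math.pi / 50
forward = [cmath.exp(1j * theta * k) for k in range(200)]
reversing = [forward[k] if k < 30 else forward[k].conjugate() for k in range(200)]
print(find_join(forward, 10, 20, 0.05), find_join(reversing, 10, 20, 0.05))
```

In the reversing signal, samples that are close to the end point in the complex plane still fail the check because the phase is turning the other way, which is the kind of discontinuity the second condition is meant to exclude.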

[0024] Furthermore, in step S4 above, applying random scaling and phase rotation to the slice data matrix from step S3, row by row and in blocks of the confounding factor duration T, to obtain the slice matrix includes the following sub-steps:

[0025] S4.1. Assume that the influence of confounding factors remains essentially constant within a coherence time of CL samples, where CL is 1/4 of the sub-slice length, i.e., $CL = M_s / 4$. Then, within one data slice length, the number of changes in confounding factors is $CC = N / CL$;

[0026] S4.2. Set the scaling factor ρ and the phase rotation factor φ. First generate, row by row, the base matrix of the scaling factor and the base matrix of the phase rotation factor; then, according to the coherence time length CL, expand these base matrices into the scaling factor matrix and the phase rotation factor matrix;

[0027] S4.3. Apply the scaling and rotation formula to the slice data matrix to perform scaling and rotation.
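
Sub-steps S4.1 to S4.3 can be sketched as below. The scaling and rotation formula is not reproduced in this text, so the sketch assumes the common form in which each sample is multiplied by ρ·e^{jφ}; the uniform sampling ranges and the function names are likewise assumptions made for illustration.

```python
import cmath
import random


def expand_base(base_row, block_len):
    """Repeat each base factor block_len times (one factor per block)."""
    return [value for value in base_row for _ in range(block_len)]


def scale_and_rotate(slices, block_len, rho_range, phi_range, rng):
    """Apply block-constant random gain and phase rotation to every row.

    Assumed formula: out = rho * sample * exp(j * phi), with rho and phi
    held constant over each block of block_len samples (the coherence
    length CL in the text).
    """
    out = []
    for row in slices:
        if len(row) % block_len != 0:
            raise ValueError("row length must be a multiple of block_len")
        num_blocks = len(row) // block_len  # CC = N / CL
        rho_base = [rng.uniform(*rho_range) for _ in range(num_blocks)]
        phi_base = [rng.uniform(*phi_range) for _ in range(num_blocks)]
        rho = expand_base(rho_base, block_len)
        phi = expand_base(phi_base, block_len)
        out.append([r * s * cmath.exp(1j * p)
                    for r, s, p in zip(rho, row, phi)])
    return out


# Toy input: one row of eight identical samples, two blocks of four.
demo = scale_and_rotate([[1 + 0j] * 8], block_len=4,
                        rho_range=(0.8, 1.2), phi_range=(-0.1, 0.1),
                        rng=random.Random(1))
print(demo[0][0] == demo[0][3], demo[0][4] == demo[0][7])
```

Holding the factors constant over each block follows the stated assumption that confounding factors stay essentially unchanged within the coherence time, rather than perturbing every sample independently.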

[0028] Due to the adoption of the technical solution described above, the present invention has the following advantages:

[0029] This invention provides a data augmentation method for radio frequency fingerprint recognition. In radio frequency fingerprint recognition, the signal segment to be identified may contain a portion that indicates the radiation source ID, so the recognition model tends to learn the radio signal protocol rather than the hardware fingerprint features of the radiation source. To address this, the method employs random slicing based on Euclidean distance and phase angle discrimination, which minimizes the influence of the ID-bearing signal segment while avoiding abrupt changes at the splice points of the random slices that would affect radio frequency fingerprint recognition. Furthermore, random scaling and phase rotation are used to simulate the disturbance of confounding factors on the radio signal, which addresses the case where confounding factors such as the wireless channel environment are unknown in the test environment and simulation parameters such as the wireless channel model cannot be configured.

[0030] This invention provides a data augmentation method for radio frequency fingerprint recognition, which can be easily integrated into various radio frequency fingerprint recognition methods, continuously generating rich and varied augmented data, mitigating model overfitting, improving the robustness of radio frequency fingerprints, and is applicable to both offline and online data augmentation applications.

Attached Figure Description

[0031] Figure 1 is a flowchart of the data augmentation method for radio frequency fingerprint recognition according to the present invention;

[0032] Figure 2 is a schematic diagram of the random data slicing method based on Euclidean distance and phase difference discrimination;

[0033] Figure 3 is a waveform diagram of the I-path and Q-path of a 16QAM modulated complex baseband signal;

[0034] Figure 4 is a comparison diagram of the I-channel waveforms before and after data augmentation using the data augmentation method for radio frequency fingerprint recognition of the present invention;

[0035] Figure 5a is a data constellation diagram after data augmentation using the data augmentation method for radio frequency fingerprint recognition of the present invention;

[0036] Figure 5b is a data constellation diagram of data that has not been augmented using the data augmentation method for radio frequency fingerprint recognition of this invention.

Detailed Implementation

[0037] The technical solution of the present invention will be further described in detail below with reference to the accompanying drawings and embodiments.

[0038] As shown in Figure 1, a data augmentation method for radio frequency fingerprint recognition includes the following steps:

[0039] S1. Read a pre-acquired segment of I/Q data (radio signal) and preprocess it. If the I/Q data is complex baseband data, it remains unchanged; otherwise, it is converted into complex baseband I/Q data. The complex baseband I/Q data is represented as $Y = I + jQ$, where I represents the real part of the signal, Q represents the imaginary part of the signal, and M represents the number of sample points;

[0040] S2. Normalize the complex baseband data Y from step S1 using the maximum value of the complex envelope to obtain the normalized complex baseband data $Y_n$, namely $Y_n = Y / \max(|Y|)$;

[0041] S3. Randomly slice the normalized complex baseband data $Y_n$ from step S2 to generate a slice data matrix of M slices, each of length N. Building on the random data slices, each slice of the slice data matrix is further cut into O sub-slices to minimize the presence of radio signal segments that could identify the individual radiation source. As shown in Figure 2, this includes the following specific sub-steps:

[0042] S3.1. Consider the starting point of the j-th sub-slice of the i-th data slice, where i runs over the M data slices. Let the sub-slice length be $M_s$; each data slice consists of O sub-slices, and the slice length, sub-slice length, and number of sub-slices satisfy $N = M_s \times O$, all being positive integers. The starting point of the first sub-slice of each data slice is generated randomly, giving M starting points in total, one per data slice;

[0043] S3.2. In the complex baseband data $Y_n$, the sample at the first starting point and the $M_s - 1$ samples after it are assigned to the first sub-slice, Slice1Sub1; the last of these samples is the end point of the first sub-slice of the first data slice;

[0044] S3.3. Starting from the end point of the first sub-slice of the first data slice, skip forward a fixed length of l samples, then search for a sample point that satisfies the Euclidean distance and phase difference conditions; this point serves as the starting point of the second sub-slice, Slice1Sub2;

[0045] S3.4 Repeat step S3.3 to generate the remaining O-2 sub-slices in sequence. Then, connect all O sub-slices end to end and splice them together to generate the first data slice.

[0046] S3.5. Using the remaining M-1 randomly generated slice starting points, repeat steps S3.1 to S3.4 to generate the remaining M-1 data slices; then stack all data slices row by row to form the slice data matrix;

[0047] Furthermore, the connection points between adjacent sub-slices are selected using Euclidean distance and phase difference discrimination methods to maintain the continuity between the sub-slices, including the following specific sub-steps:

[0048] S3a. Take the end point of the j-th sub-slice of the i-th data slice. After skipping forward a fixed length of l sample points, search for a sample point that satisfies the following two conditions to serve as the starting point of the (j+1)-th sub-slice:

[0049] (1) The Euclidean distance between the candidate starting point and the end point is less than the threshold;

[0050] (2) The trend of the phase angle change at the candidate starting point is consistent with the trend of the phase angle change at the end point, that is:

[0051] ;

[0052] S4. Apply random scaling and phase rotation to the slice data matrix from step S3, row by row and in blocks of the confounding factor duration T, to obtain the slice matrix. This simulates the effects of noise and other confounding factors on radio signals in a real environment, such as wireless channel fading and transmitter phase noise.

[0053] Within a relatively short time period, the influence of confounding factors such as channel fading on the signal remains basically unchanged. Therefore, the scaling and rotation operations on the signal are not applied differently to each point, but rather a uniform scaling and rotation operation is performed on the samples within a short time period while the influence of confounding factors remains basically unchanged.

[0054] Step S4 includes the following sub-steps:

[0055] S4.1. Assume that the influence of confounding factors remains essentially constant within a coherence time of CL samples, where CL is 1/4 of the sub-slice length, i.e., $CL = M_s / 4$. Then, within one data slice length, the number of changes in confounding factors is $CC = N / CL$;

[0056] S4.2. Set the scaling factor ρ and the phase rotation factor φ. First generate, row by row, the base matrix of the scaling factor and the base matrix of the phase rotation factor; then, according to the coherence time length CL, expand these base matrices into the scaling factor matrix and the phase rotation factor matrix;

[0057] S4.3. Apply the scaling and rotation formula to the slice data matrix to perform scaling and rotation;

[0058] S5. Superimpose Gaussian white noise with different signal-to-noise ratios (SNR) on the scaled and rotated slice matrix from step S4, simulating the attenuation of wireless signals with increasing distance, to obtain a slice matrix that has undergone random cropping, scaling, rotation, and Gaussian white noise processing;

[0059] S6. According to different simulation requirements, set the number of slices M, slice length N, and number of sub-slices O. By adjusting the scaling factor ρ, rotation factor φ, duration of confounding factors T, and signal-to-noise ratio SNR, repeat steps S2 to S5 to continuously generate diverse and non-repetitive data augmentation samples.

[0060] A random data stream of length 60,000 is used as the input bitstream, and a complex baseband signal is generated using Gray coding and 16QAM modulation. The shaping filter is a root-raised cosine filter with a roll-off factor of 0.2, 4 output samples per symbol (OutputSamplesPerSymbol), and a span of 10 symbols (FilterSpanInSymbols). The sampling rate is 1 kHz, and the resulting complex baseband signal has a size of 1 × 60,000. The I-channel and Q-channel time-domain waveforms of this complex baseband signal are shown in Figure 3. Taking this complex baseband signal as an example, a specific embodiment of the data augmentation method for radio frequency fingerprint recognition of the present invention is demonstrated below.
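
The stated signal size can be checked with a short sketch: 60,000 bits at 4 bits per 16QAM symbol give 15,000 symbols, and 4 samples per symbol give 60,000 samples. The bit-to-level table below is one common Gray assignment and is an assumption, since the text does not give the mapping; pulse shaping with the root-raised cosine filter is deliberately left out.

```python
import random

# Assumed per-axis Gray mapping: adjacent levels differ in exactly one bit.
GRAY_LEVEL = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def map_16qam(bits):
    """Map groups of 4 bits to 16QAM symbols (2 bits for I, 2 bits for Q)."""
    if len(bits) % 4 != 0:
        raise ValueError("bit count must be a multiple of 4")
    return [complex(GRAY_LEVEL[(bits[k], bits[k + 1])],
                    GRAY_LEVEL[(bits[k + 2], bits[k + 3])])
            for k in range(0, len(bits), 4)]


rng = random.Random(0)
bits = [rng.randint(0, 1) for _ in range(60_000)]
symbols = map_16qam(bits)

samples_per_symbol = 4  # OutputSamplesPerSymbol in the text
print(len(symbols), len(symbols) * samples_per_symbol)  # 15000 60000
```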

[0061] The present invention provides a data enhancement method for radio frequency fingerprint recognition, which includes the following specific steps:

[0062] Step 1: Normalization

[0063] (1.1) The generated 16QAM-modulated complex baseband signal of size 1 × 60000 is normalized according to the formula $Y_n = Y / \max(|Y|)$. Figure 3 shows the first 800 samples of the I-path and Q-path after normalization;

[0064] Step 2: Randomly slice the data and use Euclidean distance and phase difference to select connection points between adjacent sub-slices to maintain continuity between the sub-slices;

[0065] (2.1) Set the slicing parameters: the number of slices is M=20, the slice length is N=4096, and the number of sub-slices is O=4;

[0066] (2.2) Following the random slicing method based on Euclidean distance and phase difference discrimination shown in Figure 2, first randomly select 20 numbers from 1 to 51808 as the starting points of sub-slice 1 (one for each data slice), and then read 1024 samples from the first starting point as sub-slice 1;

[0067] (2.3) After skipping 1024 samples from the end point of sub-slice 1, continue searching for a point whose Euclidean distance from the end point of sub-slice 1 is less than 0.05, and use this point as the data starting point of sub-slice 2;

[0068] (2.4) Starting from the data starting point of sub-slice 2, read 1024 samples forward to obtain sub-slice 2;

[0069] (2.5) Repeat steps (2.3) and (2.4) above to obtain sub-slice 3 and sub-slice 4 in sequence;

[0070] (2.6) Concatenate the four sub-slices (sub-slice 1, sub-slice 2, sub-slice 3, and sub-slice 4) to obtain a data slice of size 1 × 4096;

[0071] (2.7) Using the remaining 19 numbers as the starting points of sub-slice 1, repeat steps (2.2) to (2.6) above to finally obtain a data slice matrix of size 20 × 4096;

[0072] Step 3: Scale and Phase Rotation

[0073] (3.1) Set the scaling factor ρ and the rotation factor φ; the duration of confounding factors is $CL = M_s / 4 = 256$;

[0074] (3.2) Generate the scaling factor matrix and the phase rotation factor matrix row by row;

[0075] (3.3) Apply the scaling and rotation formula to the slice data matrix to obtain the scaled and rotated slice matrix;
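
The parameter values used in steps (2.1) and (3.1) follow from one another, as this short check illustrates. The helper name `slicing_parameters` is hypothetical; the relations it encodes (sub-slice length = N / O, CL = sub-slice length / 4, CC = N / CL) are the ones stated in S3.1 and S4.1.

```python
def slicing_parameters(slice_len, num_sub, coherence_fraction=4):
    """Derive sub-slice length, coherence length CL, and block count CC."""
    if slice_len % num_sub != 0:
        raise ValueError("slice length must be a multiple of the sub-slice count")
    sub_len = slice_len // num_sub            # M_s = N / O
    if sub_len % coherence_fraction != 0:
        raise ValueError("sub-slice length must be divisible by the fraction")
    cl = sub_len // coherence_fraction        # CL = M_s / 4
    cc = slice_len // cl                      # CC = N / CL
    return sub_len, cl, cc


sub_len, cl, cc = slicing_parameters(slice_len=4096, num_sub=4)
print(sub_len, cl, cc)  # 1024 256 16
```

With M = 20 slices, each base factor matrix is therefore 20 × 16 before it is expanded to 20 × 4096 by repeating every entry 256 times.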

[0076] Step 4: Add Gaussian white noise

[0077] (4.1) Gaussian white noise with an SNR of 30 dB is superimposed on the slice matrix row by row to obtain the augmented data after random cropping, scaling, rotation, and Gaussian white noise processing;

[0078] (4.2) Figure 4 shows the I-channel waveforms before and after processing one of the data slices using the data augmentation method of this invention; Figures 5a and 5b show the constellation diagrams of that data slice with and without augmentation, respectively.
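
Step (4.1) can be sketched as follows. This is a minimal illustration that assumes the SNR is defined against the measured mean power of each row and that the noise is circularly symmetric complex Gaussian; the text does not state either convention, and the function name `add_awgn` is hypothetical.

```python
import math
import random


def add_awgn(row, snr_db, rng):
    """Add complex Gaussian white noise to one row at the requested SNR."""
    signal_power = sum(abs(sample) ** 2 for sample in row) / len(row)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power / 2)  # split equally between I and Q
    return [sample + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
            for sample in row]


# Toy input: a constant unit-power row, so noise power should be near 1e-3.
clean = [1 + 0j] * 20_000
noisy = add_awgn(clean, snr_db=30, rng=random.Random(0))
measured = sum(abs(n - c) ** 2 for n, c in zip(noisy, clean)) / len(clean)
print(10 * math.log10(1.0 / measured))
```

Calling `add_awgn` with a different `snr_db` for each row or each pass gives the "different signal-to-noise ratios" that step S5 refers to.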

[0079] As can be seen from Figure 4, after data augmentation using the data augmentation method for radio frequency fingerprint recognition of the present invention, no obvious discontinuities appear at the splicing points of the sub-slices, and the augmented signal maintains good continuity.

[0080] The present invention provides a data augmentation method for radio frequency fingerprint recognition, which can continuously generate richly varied augmented data samples based on a limited set of data samples, thereby mitigating model overfitting and improving the robustness of radio frequency fingerprints.

[0081] The above description is only a preferred embodiment of the present invention and not a limitation thereof. Any equivalent changes and modifications made in accordance with the scope of the present invention without departing from the spirit and scope of the present invention shall be within the scope of patent protection of the present invention.

Claims

1. A data augmentation method for radio frequency fingerprint recognition, characterized in that it includes the following steps:
S1. Read a pre-acquired segment of I/Q data and preprocess it. If the I/Q data is complex baseband data, it remains unchanged; otherwise, it is converted into complex baseband I/Q data. The complex baseband I/Q data is represented as $Y = I + jQ$, where I represents the real part of the signal, Q represents the imaginary part of the signal, and M represents the number of sample points;
S2. Normalize the complex baseband data Y from step S1 using the maximum value of the complex envelope to obtain the normalized complex baseband data $Y_n$, namely $Y_n = Y / \max(|Y|)$;
S3. Randomly slice the normalized complex baseband data $Y_n$ from step S2 to generate a slice data matrix of M slices, each of length N. Then cut each slice of the slice data matrix into O sub-slices, selecting the connection points between adjacent sub-slices using Euclidean distance and phase difference discrimination to maintain continuity between the sub-slices. Selecting the connection points between adjacent sub-slices using Euclidean distance and phase difference includes the following sub-step: S3a. Take the end point of the j-th sub-slice of the i-th data slice; after skipping forward a fixed length of l sample points, search for a sample point that satisfies the following two conditions to serve as the starting point of the (j+1)-th sub-slice: (1) the Euclidean distance between the candidate starting point and the end point is less than the threshold; (2) the trend of the phase angle change at the candidate starting point is consistent with the trend of the phase angle change at the end point;
S4. Apply random scaling and phase rotation to the slice data matrix from step S3, row by row and in blocks of the confounding factor duration T, to obtain the slice matrix. Within a time period during which the influence of confounding factors remains essentially unchanged, the samples within the duration T of the confounding factors' influence are subjected to a uniform scaling and rotation operation. This includes the following sub-steps: S4.1. Assume that the influence of confounding factors remains essentially constant over a coherence time of CL samples, where CL is 1/4 of the sub-slice length, i.e., $CL = M_s / 4$; then, within one data slice length, the number of changes in confounding factors is $CC = N / CL$; S4.2. Set the scaling factor ρ and the phase rotation factor φ; first generate, row by row, the base matrix of the scaling factor and the base matrix of the phase rotation factor, then, according to the length CL, expand these base matrices into the scaling factor matrix and the phase rotation factor matrix; S4.3. Apply the scaling and rotation formula to the slice data matrix to perform scaling and rotation;
S5. Superimpose Gaussian white noise with different signal-to-noise ratios (SNR) on the slice matrix from step S4 to obtain a slice matrix that has undergone random cropping, scaling, rotation, and Gaussian white noise processing;
S6. According to different simulation requirements, set the number of slices M, slice length N, and number of sub-slices O. By adjusting the scaling factor ρ, rotation factor φ, duration of confounding factors T, and signal-to-noise ratio SNR, repeat steps S2 to S5 to continuously generate diverse and non-repetitive data augmentation samples.

2. The data augmentation method for radio frequency fingerprint recognition according to claim 1, characterized in that, in step S3, cutting the slice data matrix into O sub-slices includes the following sub-steps:
S3.1. Consider the starting point of the j-th sub-slice of the i-th data slice, where i runs over the M data slices. Let the sub-slice length be $M_s$; each data slice consists of O sub-slices, and the slice length, sub-slice length, and number of sub-slices satisfy $N = M_s \times O$, all being positive integers. The starting point of the first sub-slice of each data slice is generated randomly, giving M starting points in total, one per data slice;
S3.2. In the complex baseband data $Y_n$, the sample at the first starting point and the $M_s - 1$ samples after it are assigned to the first sub-slice, Slice1Sub1; the last of these samples is the end point of the first sub-slice of the first data slice;
S3.3. Starting from the end point of the first sub-slice of the first data slice, skip forward a fixed length of l samples, then search for a sample point that satisfies the Euclidean distance and phase difference conditions; this point serves as the starting point of the second sub-slice, Slice1Sub2;
S3.4. Repeat step S3.3 to generate the remaining O-2 sub-slices in sequence. Then connect all O sub-slices end to end and splice them together to generate the first data slice;
S3.5. Using the remaining M-1 randomly generated slice starting points, repeat steps S3.1 to S3.4 to generate the remaining M-1 data slices; then stack all data slices row by row to form the slice data matrix.