An internet of vehicles abnormal behavior detection method based on a generative adversarial network
By combining image-based feature encoding and semi-supervised generative adversarial networks with convolutional block attention and adaptive thresholding, the problems of sample scarcity and model instability in abnormal behavior detection in vehicle-to-everything (V2X) systems are solved, achieving efficient and accurate abnormal behavior recognition and improving the security and real-time performance of V2X systems.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: CHONGQING UNIV OF POSTS & TELECOMM
- Filing Date: 2026-03-11
- Publication Date: 2026-06-19
Figure CN122241504A_ABST
Description
Technical Field
[0001] This invention belongs to the field of the Internet of Vehicles and relates to a method for detecting abnormal behavior in the Internet of Vehicles based on generative adversarial networks.
Background Technology
[0002] With the rapid global deployment of connected and autonomous vehicles and cooperative intelligent transportation systems, vehicles have become closely interconnected with their surroundings through vehicular ad hoc networks and cellular networks. The core of this interconnection is the periodic broadcast of basic safety messages (BSMs) between vehicles. These messages carry key temporal and physical characteristics such as vehicle position, speed, heading, and acceleration to support advanced functions such as collision avoidance, path planning, and cooperative driving, and they have become the nerve center of intelligent transportation systems.
[0003] However, the openness, high dynamism, and decentralized nature of the Internet of Vehicles (IoV) expose it to severe security challenges. Although basic safety messages are authenticated with digital certificates through public key infrastructure (PKI) to ensure the credibility of the sender's identity, the accuracy and authenticity of the message content cannot be guaranteed. Malicious nodes or compromised vehicles can use forgery, tampering, or replay techniques to broadcast incorrect or deceptive data to the network, such as false locations or instantaneous speeds. This can directly mislead the decisions of other vehicles, causing traffic congestion, system chaos, and even catastrophic safety incidents. Therefore, building an abnormal behavior detection system that can verify the physical consistency of basic safety message content efficiently, accurately, and in real time has become a key technological bottleneck in ensuring IoV security.
[0004] Existing technologies face several limitations in meeting these security requirements. First, in real-world connected vehicle environments, abnormal behavior is a low-frequency event, so real abnormal samples for training deep learning models are extremely scarce, and acquiring large volumes of high-quality labeled data is very costly. Most existing deep learning detection models rely heavily on large-scale, class-balanced labeled data. When the training set is severely imbalanced and lacks sufficient labels, training is biased towards the majority class of normal samples, leading to low recall for the minority class of abnormal samples. In safety-critical applications, missing even one real abnormal behavior can pose unacceptable risks. Second, while some studies have introduced generative adversarial networks (GANs) to synthesize data and alleviate sample imbalance, traditional GANs are prone to mode collapse during training, and standard convolutional frameworks struggle to focus adaptively on subtle abnormal texture features in vehicle motion data. Finally, existing detection architectures and decision-making mechanisms struggle to balance real-time performance and global security: many studies still rely on static decision thresholds, which are ill-suited to the dynamic constraints of high recall and low false positive rates required by connected vehicles. Given these limitations in label scarcity, training stability, and decision optimization, a novel anomaly detection framework is proposed. This framework makes full use of a large amount of unlabeled data through a semi-supervised adversarial learning paradigm to address high labeling costs. It enhances training stability and the ability to identify covert attacks through feature matching and attention mechanisms, and adopts adaptive threshold optimization to meet the stringent safety requirements of the Internet of Vehicles.
Summary of the Invention
[0005] In view of this, the purpose of this invention is to provide a method for detecting abnormal behavior in vehicle networks based on generative adversarial networks.
[0006] To achieve the above objectives, the present invention provides the following technical solution:
[0007] S1: Image-based feature encoding: Collect vehicle basic safety messages (BSMs), normalize the multi-dimensional temporal physical features, and map them into a fixed-size RGB image tensor;
[0008] S2: Semi-supervised model construction: Construct a semi-supervised generative adversarial network model consisting of a generator and a discriminator. The generator uses a decoder structure to map random noise into pseudo-samples, and the discriminator integrates a convolutional block attention module and sets up an output layer with a total of K+1 dimensions covering the real classes and the pseudo-class;
[0009] S3: Convolutional Neural Network Improvement: Construct a convolutional neural network that combines channel attention and spatial attention, enabling the model to learn the importance weights of different physical feature channels and to capture the spatial features of the image;
[0010] S4: Semi-supervised adversarial training: The discriminator updates its weights using a weighted combination of supervised and unsupervised losses, and the generator uses a feature matching loss to reduce the distribution difference between real and pseudo samples in an internal feature layer of the discriminator, improving training stability;
[0011] S5: Adaptive threshold determination: Subject to the preset safety constraints, the probability space is traversed to dynamically search for the optimal classification threshold that maximizes the F1 score.
[0012] Furthermore, the specific process of S1 is as follows. To address the difficulty of processing high-dimensional BSM time-series data, an image encoding module for time-series physical features is designed. This module converts high-dimensional vehicle-to-everything (V2X) basic safety message (BSM) time-series data into a two-dimensional RGB image representation that can be efficiently processed by convolutional neural networks, overcoming the limitations of traditional time-series analysis methods. In a specific implementation, BSM data streams within a continuous time window are first collected through onboard or roadside communication units, and key physical features representing the vehicle's motion state, including displacement, velocity, acceleration, and heading angle, are parsed from them. Because differences in magnitude between physical units may cause model training to diverge, the system applies min-max normalization, based on the historical statistical maximum and minimum values, to map all of the above physical features into one common interval.
[0013] Building upon this foundation, we further defined and implemented an RGB channel mapping mechanism, assigning color semantics to the normalized physical features of the image: we precisely map positional features to the R channel, reflecting movement trajectories; velocity features to the G channel, reflecting driving speed; and acceleration features to the B channel, reflecting the intensity of driving behavior. This mapping successfully transforms temporal sequence evolution into spatial texture patterns, causing abnormal driving behavior to manifest as specific color breaks or mismatched texture anomalies in the image. Finally, we resample and align the aforementioned feature matrices using pixel transformation and bilinear interpolation algorithms to generate a standard RGB image that meets the preset resolution requirements, which serves as the input to the deep classifier.
[0014] Furthermore, in S2, to address the inherent problems of scarce abnormal behavior samples and extreme class imbalance in vehicle-to-everything (V2X) networks, a semi-supervised generative adversarial network (GAN) model consisting of a generator and a discriminator is constructed. This module aims to train a powerful classifier using a semi-supervised learning paradigm, leveraging both a small amount of labeled data and a large amount of unlabeled data. The specific steps are as follows:
[0015] First, after the convolutional layers of the classifier's backbone network, this embodiment embeds a convolutional block attention module. By cascading channel attention and spatial attention mechanisms, it guides the model to focus on the most discriminative feature regions. The specific steps are as follows:
[0016] S21. Implementation of the generator structure: A decoder architecture is adopted. First, the generator receives a 128-dimensional random noise vector drawn from a standard normal distribution; it then maps and reshapes the noise vector into a feature map of size [size missing]. The feature map is passed through three transposed convolutional layers that progressively upsample the spatial dimensions, producing a pseudo-sample tensor with the same dimensions as the real image. To alleviate the vanishing gradient problem and stabilize the feature distribution, each transposed convolution is followed by a batch normalization layer and a leaky rectified linear unit (Leaky ReLU) activation layer with its parameter set to 0.2. At the output, a hyperbolic tangent function maps the generated pseudo-sample pixel values to the interval $[-1, 1]$.
[0017] S22. Construction of the discriminator backbone network: A feature extraction architecture consisting of multi-level convolutional blocks is adopted. Each convolutional block integrates two consecutive convolutional layers, a rectified linear unit (ReLU) activation layer, a batch normalization layer, and a dropout layer with a dropout rate of 0.25, in order to extract spatiotemporal distribution features layer by layer and suppress overfitting.
[0018] S23. Classification Output Layer Design: At the end of the feature extraction network, the discriminator compresses the spatial dimensions through a global average pooling layer, outputs a 256-dimensional deep feature vector via a fully connected layer, and finally connects to a fully connected layer without an activation function, which outputs a logit (log-odds) vector of dimension $K+1$. Here $K$ is the number of real classes: the first $K$ dimensions represent the scores of the input sample belonging to each real class, and the $(K+1)$-th dimension specifically represents the score of the input sample belonging to the pseudo-class synthesized by the generator.
[0019] Furthermore, in S3, to address the limitation of conventional convolution in adaptively focusing on abnormal physical textures of vehicles, we embed a convolutional block attention module within each feature extraction convolutional block of the discriminator. This module aims to guide the model to focus on the most discriminative feature regions by cascading channel attention and spatial attention mechanisms. The specific steps are as follows:
[0020] S31. Implementation of Channel Attention Mechanism: First, global average pooling and global max pooling are performed on the input feature map to compress spatial information into two channel descriptors; then, both descriptors are fed into a shared multilayer perceptron to learn the importance weights of different channels; finally, the two outputs are added together and passed through the Sigmoid activation function to generate a channel weight vector, which reweights the input feature map along the channel dimension and strengthens the response of the channels where the abnormal signal is located.
[0021] S32. Implementation of the Spatial Attention Mechanism: On the channel-weighted feature map, average pooling and max pooling are performed along the channel axis to generate two two-dimensional feature descriptors. These descriptors are concatenated, and their spatial information is fused by a 7×7 convolutional layer followed by the Sigmoid function to generate a spatial attention map. This attention map locates "pixel block" regions in the image that violate physical laws. Finally, the spatial attention map is multiplied element-wise with the feature map to complete the adaptive recalibration of the features.
[0022] Furthermore, in S4, to address the issues of traditional models' heavy reliance on large amounts of manually labeled data and the instability of generative adversarial networks (GANs) training, a semi-supervised adversarial training optimization module combining supervised loss, unsupervised loss, and feature matching loss is designed. This module aims to alternately optimize the discriminator and generator, forcing the model to learn robust features from massive amounts of unlabeled data. The specific steps are as follows:
[0023] S41. Constructing the supervised loss for the discriminator: This loss is calculated only for labeled samples. The sparse cross-entropy between the predictions for the labeled samples and their true labels is calculated using the first $K$ dimensions of the discriminator's output logits. The formula is as follows:
$$L_{\mathrm{sup}} = -\mathbb{E}_{(x, y) \sim p_{\mathrm{data}}(x, y)}\left[\log p_D(y \mid x,\, y \le K)\right]$$
where $\mathbb{E}$ represents the mathematical expectation; $(x, y)$ denotes a labeled real sample and its corresponding real class label, which follow the labeled data distribution $p_{\mathrm{data}}(x, y)$; $K$ is the total number of real classes; and $p_D(y \mid x,\, y \le K)$ is the probability, predicted by the discriminator network, that the input sample $x$ belongs to its corresponding real class $y$ (class index $y \le K$). This loss term gives the discriminator its basic ability to classify accurately.
[0024] S42. Constructing the unsupervised loss for the discriminator: This loss does not depend on real label data. First, for all real samples, it maximizes the probability that they belong to one of the real classes; second, for pseudo-samples synthesized by the generator, it maximizes the probability that they belong to the $(K+1)$-th class. The formula is as follows:
$$L_{\mathrm{unsup}} = -\mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log p_D(y \le K \mid x)\right] - \mathbb{E}_{z \sim \mathcal{N}(0, I)}\left[\log p_D(y = K+1 \mid G(z))\right]$$
where $x$ represents any real sample (with or without a label), which follows the overall real data distribution $p_{\mathrm{data}}(x)$; $p_D(y \le K \mid x)$ is the total probability, predicted by the discriminator, that the real sample $x$ belongs to the set of all $K$ real classes; $z$ represents a random noise vector that follows the standard normal distribution $\mathcal{N}(0, I)$; $G(z)$ is the pseudo-sample generated by the generator from the noise $z$; and $p_D(y = K+1 \mid G(z))$ is the probability, predicted by the discriminator, that the pseudo-sample $G(z)$ belongs to the $(K+1)$-th class, that is, the pseudo-class with index $K+1$.
[0025] Ultimately, the total loss of the discriminator is obtained as the weighted sum of the supervised loss and the unsupervised loss, i.e. $L_D = \lambda_{\mathrm{sup}} L_{\mathrm{sup}} + \lambda_{\mathrm{unsup}} L_{\mathrm{unsup}}$, where $\lambda_{\mathrm{sup}}$ and $\lambda_{\mathrm{unsup}}$ are the preset supervised loss weight and unsupervised loss weight, respectively, thus making full use of the information in the unlabeled data.
[0026] S43. Constructing the Generator's Feature Matching Loss: To avoid the mode collapse problem of traditional generative adversarial networks, the conventional adversarial classification loss is abandoned. The 256-dimensional intermediate feature vector before the final fully connected layer of the discriminator is extracted, and the L1-norm distance between the mean feature vectors of the real sample set and the pseudo sample set in this feature space is calculated and used as the optimization objective of the generator. The formula is as follows:
$$L_{\mathrm{FM}} = \left\| \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[f(x)\right] - \mathbb{E}_{z \sim \mathcal{N}(0, I)}\left[f(G(z))\right] \right\|_1$$
where $L_{\mathrm{FM}}$ represents the feature matching loss of the generator; $f(\cdot)$ represents the high-dimensional intermediate feature vector extracted by the discriminator; and $\|\cdot\|_1$ denotes the L1 norm of a vector, that is, the sum of the absolute values of its components.
[0027] S44. Perform alternating gradient updates: In each training step, sample real data and random noise, and calculate the total loss of the discriminator. The discriminator parameters are then updated via backpropagation; subsequently, random noise is resampled to generate pseudo-samples, and the feature matching loss is calculated. The generator parameters are then updated by backpropagation until the model converges.
[0028] Furthermore, in S5, to address the problem that traditional fixed thresholds cannot meet the safety requirements of vehicle-to-everything (V2X) communication, an adaptive threshold search mechanism based on the F1 score is introduced during the model validation phase. The system traverses the candidate threshold space and determines the globally optimal classification threshold by calculating the harmonic mean of precision and recall under different thresholds. Its objective function is as follows:
$$\tau^{*} = \arg\max_{\tau} \frac{2\,P(\tau)\,R(\tau)}{P(\tau) + R(\tau)}$$
where $\tau$ represents the candidate probability threshold during the traversal, taking values within the probability range $[0, 1]$; $\tau^{*}$ represents the globally optimal decision threshold determined by the search, which is then fixed and deployed in the subsequent real-time detection process; $P(\tau)$ is the precision at the current threshold $\tau$, that is, the proportion of truly anomalous samples among all samples judged as anomalous, reflecting the model's ability to suppress false positives; and $R(\tau)$ is the recall at the current threshold $\tau$, that is, the proportion of all real anomalies that are correctly detected, reflecting the model's ability to suppress false negatives. The formula as a whole finds the threshold that maximizes the harmonic mean of precision and recall (the F1 score), thereby optimizing overall system performance.
Attached Figure Description
[0029] To make the objectives, technical solutions, and advantages of the present invention clearer, the preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings, wherein:
[0030] Figure 1 is a flowchart of the method for detecting abnormal behavior in vehicle-to-everything (V2X) networks based on generative adversarial networks.
[0031] Figure 2 is a schematic diagram of the image encoding of temporal physical features.
[0032] Figure 3 is a diagram of the classification model architecture.
Detailed Implementation
[0033] The following specific examples illustrate the implementation of the present invention. Those skilled in the art can easily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other different specific embodiments, and various details in this specification can be modified or changed based on different viewpoints and applications without departing from the spirit of the present invention. It should be noted that the illustrations provided in the following embodiments are only schematic representations of the basic concept of the present invention. Unless otherwise specified, the following embodiments and features can be combined with each other.
[0034] The accompanying drawings are for illustrative purposes only and are schematic diagrams, not actual pictures. They should not be construed as limiting the invention. To better illustrate the embodiments of the invention, some parts in the drawings may be omitted, enlarged, or reduced, and do not represent the actual product dimensions. It is understandable to those skilled in the art that some well-known structures and their descriptions may be omitted in the drawings.
[0035] In the accompanying drawings of the embodiments of the present invention, the same or similar reference numerals correspond to the same or similar components. In the description of the present invention, it should be understood that if terms such as "upper," "lower," "left," "right," "front," and "rear" indicate the orientation or positional relationship based on the orientation or positional relationship shown in the drawings, they are only for the convenience of describing the present invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, the terms used to describe positional relationships in the drawings are only for illustrative purposes and should not be construed as limiting the present invention. For those skilled in the art, the specific meaning of the above terms can be understood according to the specific circumstances.
[0036] Referring to Figures 1-3, this invention provides a method for detecting abnormal behavior in vehicle-to-everything (V2X) networks based on generative adversarial networks. Figure 1 is a flowchart of the method, Figure 2 is a schematic diagram of the image encoding of temporal physical features, and Figure 3 is a diagram of the classification model architecture.
[0037] With reference to the accompanying drawings, the method includes the following steps:
[0038] S1: Image Feature Encoding: To address the difficulty of processing high-dimensional BSM time-series data, an image-based encoding module for time-series physical features is designed. This module converts high-dimensional vehicle-to-everything (V2X) Basic Safety Message (BSM) time-series data into a two-dimensional RGB image representation that can be efficiently processed by a convolutional neural network (CNN), overcoming the limitations of traditional time-series analysis methods. In practice, BSM data streams within a continuous time window are first acquired through onboard or roadside communication units, and key physical features representing the vehicle's motion state, including displacement, velocity, acceleration, and heading angle, are extracted. Because differences in magnitude between physical units may cause model training to diverge, the system applies min-max normalization, based on the historical statistical maximum and minimum values, to map all of the above physical features into one common interval.
[0039] Building upon this foundation, we further defined and implemented an RGB channel mapping mechanism, assigning color semantics to the normalized physical features of the image: we precisely map positional features to the R channel, reflecting movement trajectories; velocity features to the G channel, reflecting driving speed; and acceleration features to the B channel, reflecting the intensity of driving behavior. This mapping successfully transforms temporal sequence evolution into spatial texture patterns, causing abnormal driving behavior to manifest as specific color breaks or mismatched texture anomalies in the image. Finally, we resample and align the aforementioned feature matrices using pixel transformation and bilinear interpolation algorithms to generate a standard RGB image that meets the preset resolution requirements, which serves as the input to a deep classifier (CNN).
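The S1 encoding can be illustrated with a minimal NumPy sketch. Several details are assumptions made only for illustration and are not stated in the text: the normalized range is taken as [0, 1], a window of `side * side` samples is laid out row by row into a square grid, and the function names (`min_max_normalize`, `bilinear_resize`, `encode_window`) are hypothetical.

```python
import numpy as np

def min_max_normalize(x, lo, hi):
    """Scale x into [0, 1] using a historical minimum lo and maximum hi."""
    span = hi - lo if hi > lo else 1.0          # guard against a zero range
    return np.clip((np.asarray(x, dtype=float) - lo) / span, 0.0, 1.0)

def bilinear_resize(img, out_h, out_w):
    """Resize an (H, W, C) array to (out_h, out_w, C) by bilinear interpolation."""
    h, w, _ = img.shape
    ys = np.linspace(0, h - 1, out_h)           # source row coordinates
    xs = np.linspace(0, w - 1, out_w)           # source column coordinates
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]               # vertical blend weights
    wx = (xs - x0)[None, :, None]               # horizontal blend weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def encode_window(pos, vel, acc, bounds, side, out_size):
    """Map one BSM window (three series of length side*side) to an RGB image."""
    channels = []
    for series, (lo, hi) in zip((pos, vel, acc), bounds):
        norm = min_max_normalize(series, lo, hi)
        channels.append(norm.reshape(side, side))   # assumed row-major layout
    img = np.stack(channels, axis=-1)   # R = position, G = velocity, B = acceleration
    return bilinear_resize(img, out_size, out_size)

t = np.arange(16, dtype=float)
image = encode_window(t, 2 * t, np.zeros(16),
                      bounds=[(0, 15), (0, 30), (-1, 1)], side=4, out_size=8)
print(image.shape)  # (8, 8, 3)
```

In this layout a physically inconsistent message, for example a position jump without a matching velocity change, would show up as a mismatch between the R and G channels at the same pixel.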
[0040] S2. Semi-supervised Model Construction: To address the inherent problems of scarce abnormal behavior samples and extreme class imbalance in connected vehicles, a semi-supervised generative adversarial network (GAN) model consisting of a generator and a discriminator is constructed. This module aims to train a powerful classifier using a semi-supervised learning paradigm, leveraging both a small amount of labeled data and a large amount of unlabeled data. The specific steps are as follows:
[0041] First, after the convolutional layers of the classifier's backbone network, this embodiment embeds a Convolutional Block Attention (CBAM) module. By cascading channel attention and spatial attention mechanisms, it guides the model to focus on the most discriminative feature regions. The specific steps are as follows:
[0042] S21. Implementation of the generator structure: A decoder architecture is adopted. First, the generator receives a 128-dimensional random noise vector drawn from a standard normal distribution; it then maps and reshapes the noise vector into a feature map of size [size missing]. The feature map is passed through three transposed convolutional layers that progressively upsample the spatial dimensions, producing a pseudo-sample tensor with the same dimensions as the real image. To alleviate the vanishing gradient problem and stabilize the feature distribution, each transposed convolution is followed by a batch normalization layer and a leaky rectified linear unit (Leaky ReLU) activation layer with its parameter set to 0.2. At the output, a hyperbolic tangent function maps the generated pseudo-sample pixel values to the interval $[-1, 1]$.
[0043] S22. Construction of the discriminator backbone network: A feature extraction architecture consisting of multi-level convolutional blocks is adopted. Each convolutional block integrates two consecutive convolutional layers, a rectified linear unit (ReLU) activation layer, a batch normalization layer, and a dropout layer with a dropout rate of 0.25, in order to extract spatiotemporal distribution features layer by layer and suppress overfitting.
[0044] S23. Classification Output Layer Design: At the end of the feature extraction network, the discriminator compresses the spatial dimensions through a global average pooling layer, outputs a 256-dimensional deep feature vector via a fully connected layer, and finally connects to a fully connected layer without an activation function, which outputs a logit (log-odds) vector of dimension $K+1$. Here $K$ is the number of real classes: the first $K$ dimensions represent the scores of the input sample belonging to each real class, and the $(K+1)$-th dimension specifically represents the score of the input sample belonging to the pseudo-class synthesized by the generator.
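To make the role of the K+1 output dimensions concrete, the following sketch turns a batch of discriminator logits into per-class probabilities with a softmax and separates the real-class part from the pseudo-class part. It assumes the pseudo-class occupies the last position of the logit vector; the function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max subtraction for numerical stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def split_probabilities(logits):
    """Split K+1 logits into real-class probabilities and the pseudo-class probability.

    Returns (p_classes, p_real, p_fake) where p_classes has shape (N, K),
    p_real is the total probability of the K real classes and p_fake is the
    probability of the pseudo-class (assumed to be the last logit).
    """
    p = softmax(np.asarray(logits, dtype=float))
    p_classes = p[..., :-1]
    p_fake = p[..., -1]
    return p_classes, p_classes.sum(axis=-1), p_fake

# K = 2 real classes (for example normal and abnormal) plus one pseudo-class.
p_classes, p_real, p_fake = split_probabilities([[2.0, 0.5, -1.0],
                                                 [0.0, 0.0, 0.0]])
```

For every sample the total real-class probability and the pseudo-class probability add up to one, which is what the unsupervised loss in S42 relies on.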
[0045] S3. Convolutional Neural Network Improvement: To address the limitation of conventional convolutional neural networks in adaptively focusing on abnormal physical textures of vehicles, a convolutional block attention module is embedded within each feature extraction convolutional block of the discriminator. This module guides the model to focus on the most discriminative feature regions by cascading channel attention and spatial attention mechanisms. The specific steps are as follows:
[0046] S31. Implementation of Channel Attention Mechanism: First, global average pooling and global max pooling are performed on the input feature map to compress spatial information into two channel descriptors; then, both descriptors are fed into a shared multilayer perceptron to learn the importance weights of different channels; finally, the two outputs are added together and passed through the Sigmoid activation function to generate a channel weight vector, which reweights the input feature map along the channel dimension and strengthens the response of the channels where the abnormal signal is located.
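A minimal NumPy sketch of the channel attention step in S31 is given below. The two-layer bias-free perceptron with a ReLU hidden layer and a channel reduction follows the commonly used CBAM design and is an assumption here, since the text only states that the perceptron is shared; the weights `w1` and `w2` would be learned in practice.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(fmap, w1, w2):
    """Apply channel attention to a (C, H, W) feature map.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); together they form
    the shared MLP (r is an assumed reduction ratio).
    Returns the reweighted feature map and the (C,) channel weight vector.
    """
    avg_desc = fmap.mean(axis=(1, 2))       # global average pooling -> (C,)
    max_desc = fmap.max(axis=(1, 2))        # global max pooling -> (C,)

    def shared_mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # same weights for both descriptors

    weights = sigmoid(shared_mlp(avg_desc) + shared_mlp(max_desc))
    return fmap * weights[:, None, None], weights

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 5, 5))       # C = 8 channels, 5 x 5 spatial grid
w1 = rng.normal(size=(2, 8))                # reduction ratio r = 4
w2 = rng.normal(size=(8, 2))
reweighted, channel_weights = channel_attention(features, w1, w2)
```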
[0047] S32. Implementation of the Spatial Attention Mechanism: On the channel-weighted feature map, average pooling and max pooling are performed along the channel axis to generate two two-dimensional feature descriptors. These descriptors are concatenated, and their spatial information is fused by a 7×7 convolutional layer followed by the Sigmoid function to generate a spatial attention map. This attention map locates "pixel block" regions in the image that violate physical laws. Finally, the spatial attention map is multiplied element-wise with the feature map to complete the adaptive recalibration of the features.
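The spatial attention step in S32 can be sketched in the same style. The convolution below is a plain loop-based implementation with zero padding and no bias, chosen for readability rather than speed; the padding scheme and the function names are assumptions, and the 7×7 kernel would be learned in practice.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv2d_same(x, kernel):
    """Cross-correlate a (C_in, H, W) input with a (C_in, k, k) kernel.

    Zero padding keeps the spatial size; the result is a single (H, W) map.
    """
    _, h, w = x.shape
    k = kernel.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((h, w))
    for i in range(k):
        for j in range(k):
            # Add the contribution of kernel tap (i, j) from every input channel.
            out += np.sum(kernel[:, i, j, None, None] * xp[:, i:i + h, j:j + w], axis=0)
    return out

def spatial_attention(fmap, kernel):
    """Apply spatial attention to a (C, H, W) feature map with a (2, 7, 7) kernel."""
    avg_map = fmap.mean(axis=0)                 # average pooling along channels
    max_map = fmap.max(axis=0)                  # max pooling along channels
    descriptors = np.stack([avg_map, max_map])  # concatenated -> (2, H, W)
    attention = sigmoid(conv2d_same(descriptors, kernel))
    return fmap * attention[None], attention

fmap = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
kernel = np.zeros((2, 7, 7))
kernel[0, 3, 3] = 1.0                           # pass the average map through unchanged
recalibrated, attention = spatial_attention(fmap, kernel)
```

Applying `channel_attention`-style reweighting first and this function second gives the cascaded order described for the convolutional block attention module.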
[0048] S4. Semi-supervised Adversarial Training: To address the issues of traditional models' heavy reliance on large amounts of manually labeled data and the instability of generative adversarial networks (GANs) training, a semi-supervised adversarial training optimization module was designed, combining supervised loss, unsupervised loss, and feature matching loss. This module aims to alternately optimize the discriminator and generator, forcing the model to learn robust features from massive amounts of unlabeled data. The specific steps are as follows:
[0049] S41. Constructing the supervised loss for the discriminator: This loss is calculated only for labeled samples. The sparse cross-entropy between the predictions for the labeled samples and their true labels is calculated using the first $K$ dimensions of the discriminator's output logits. The formula is as follows:
$$L_{\mathrm{sup}} = -\mathbb{E}_{(x, y) \sim p_{\mathrm{data}}(x, y)}\left[\log p_D(y \mid x,\, y \le K)\right]$$
[0050] where $\mathbb{E}$ represents the mathematical expectation; $(x, y)$ denotes a labeled real sample and its corresponding real class label, which follow the labeled data distribution $p_{\mathrm{data}}(x, y)$; $K$ is the total number of real classes; and $p_D(y \mid x,\, y \le K)$ is the probability, predicted by the discriminator network, that the input sample $x$ belongs to its corresponding real class $y$ (class index $y \le K$). This loss term gives the discriminator its basic ability to classify accurately.
[0051] S42. Constructing the unsupervised loss for the discriminator: This loss does not depend on real label data. First, for all real samples, it maximizes the probability that they belong to one of the real classes; second, for pseudo-samples synthesized by the generator, it maximizes the probability that they belong to the $(K+1)$-th class. The formula is as follows:
$$L_{\mathrm{unsup}} = -\mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log p_D(y \le K \mid x)\right] - \mathbb{E}_{z \sim \mathcal{N}(0, I)}\left[\log p_D(y = K+1 \mid G(z))\right]$$
where $x$ represents any real sample (with or without a label), which follows the overall real data distribution $p_{\mathrm{data}}(x)$; $p_D(y \le K \mid x)$ is the total probability, predicted by the discriminator, that the real sample $x$ belongs to the set of all $K$ real classes; $z$ represents a random noise vector that follows the standard normal distribution $\mathcal{N}(0, I)$; $G(z)$ is the pseudo-sample generated by the generator from the noise $z$; and $p_D(y = K+1 \mid G(z))$ is the probability, predicted by the discriminator, that the pseudo-sample $G(z)$ belongs to the $(K+1)$-th class, that is, the pseudo-class with index $K+1$.
[0052] Ultimately, the total loss of the discriminator is obtained as the weighted sum of the supervised loss and the unsupervised loss, i.e. $L_D = \lambda_{\mathrm{sup}} L_{\mathrm{sup}} + \lambda_{\mathrm{unsup}} L_{\mathrm{unsup}}$, where $\lambda_{\mathrm{sup}}$ and $\lambda_{\mathrm{unsup}}$ are the preset supervised loss weight and unsupervised loss weight, respectively, thus making full use of the information in the unlabeled data.
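The three discriminator loss terms from S41 and S42 can be written down compactly on raw logits. This is a sketch under stated assumptions: the pseudo-class is the last logit, labels are zero-based integers below K, a small `eps` guards the logarithms, and the default weights of 1.0 are placeholders because the preset weight values are not given.

```python
import numpy as np

def log_softmax(logits):
    """Row-wise log-softmax for a 2-D array of logits."""
    z = logits - logits.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def supervised_loss(logits_labeled, labels):
    """Cross-entropy over the first K logits only, for labeled real samples."""
    logp = log_softmax(logits_labeled[:, :-1])   # drop the pseudo-class logit
    return -logp[np.arange(len(labels)), labels].mean()

def unsupervised_loss(logits_real, logits_fake, eps=1e-12):
    """Real samples should look real; generated samples should look fake."""
    p_fake_on_real = np.exp(log_softmax(logits_real))[:, -1]
    p_fake_on_fake = np.exp(log_softmax(logits_fake))[:, -1]
    real_term = -np.log(1.0 - p_fake_on_real + eps).mean()  # -E log p(real class | x)
    fake_term = -np.log(p_fake_on_fake + eps).mean()        # -E log p(pseudo-class | G(z))
    return real_term + fake_term

def discriminator_loss(logits_labeled, labels, logits_real, logits_fake,
                       w_sup=1.0, w_unsup=1.0):
    """Weighted sum of the supervised and unsupervised terms."""
    return (w_sup * supervised_loss(logits_labeled, labels)
            + w_unsup * unsupervised_loss(logits_real, logits_fake))

# Uninformative logits for K = 2 real classes plus one pseudo-class.
zeros = np.zeros((4, 3))
labels = np.array([0, 1, 0, 1])
total = discriminator_loss(zeros, labels, zeros, zeros, w_sup=1.0, w_unsup=0.5)
```

With all-zero logits the supervised term equals log 2 and the unsupervised term equals log 4.5, which is a convenient sanity check before plugging the functions into a training loop.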
[0053] S43. Constructing the Generator's Feature Matching Loss: To avoid the mode collapse problem of traditional generative adversarial networks, the conventional adversarial classification loss is abandoned. The 256-dimensional intermediate feature vector before the final fully connected layer of the discriminator is extracted, and the L1-norm distance between the mean feature vectors of the real sample set and the pseudo sample set in this feature space is calculated and used as the optimization objective of the generator. The formula is as follows:
$$L_{\mathrm{FM}} = \left\| \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[f(x)\right] - \mathbb{E}_{z \sim \mathcal{N}(0, I)}\left[f(G(z))\right] \right\|_1$$
where $L_{\mathrm{FM}}$ represents the feature matching loss of the generator; $f(\cdot)$ represents the high-dimensional intermediate feature vector extracted by the discriminator; and $\|\cdot\|_1$ denotes the L1 norm of a vector, that is, the sum of the absolute values of its components.
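On a mini-batch, the expectations in the feature matching loss are replaced by batch means. A minimal sketch, assuming the discriminator features have already been extracted into arrays of shape (batch, 256); the function name is hypothetical.

```python
import numpy as np

def feature_matching_loss(feat_real, feat_fake):
    """L1 distance between the batch-mean feature vectors of real and fake samples.

    feat_real has shape (N, D) and feat_fake has shape (M, D); D is 256 here.
    """
    mean_real = feat_real.mean(axis=0)
    mean_fake = feat_fake.mean(axis=0)
    return np.abs(mean_real - mean_fake).sum()

rng = np.random.default_rng(0)
real_features = rng.normal(loc=0.0, size=(32, 256))
fake_features = rng.normal(loc=1.0, size=(32, 256))
loss = feature_matching_loss(real_features, fake_features)
```

Because only the mean features are compared, the generator is pushed to match the statistics of real data rather than to fool the classifier on each individual sample, which is the stabilizing effect the step relies on.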
[0054] S44. Perform alternating gradient updates: In each training step, sample real data and random noise, and calculate the total loss of the discriminator. The discriminator parameters are then updated via backpropagation; subsequently, random noise is resampled to generate pseudo-samples, and the feature matching loss is calculated. The generator parameters are then updated by backpropagation until the model converges.
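The control flow of S44 can be sketched independently of any deep learning framework. In the sketch below the four callables are hypothetical placeholders supplied by the caller: the two update functions are assumed to compute their loss, backpropagate, apply one optimizer step and return the loss value. A fixed step count stands in for the convergence check.

```python
def train_alternating(steps, sample_real, sample_noise,
                      update_discriminator, update_generator):
    """Alternate one discriminator update and one generator update per step."""
    history = []
    for _ in range(steps):
        real_batch = sample_real()
        noise = sample_noise()
        # Discriminator step: total loss (supervised + unsupervised terms).
        d_loss = update_discriminator(real_batch, noise)
        # Generator step: fresh noise, feature matching loss against real data.
        noise = sample_noise()
        g_loss = update_generator(real_batch, noise)
        history.append((d_loss, g_loss))
    return history

# Stub callables that only record the call order, to show the schedule.
calls = []

def sample_real():
    calls.append("real")
    return "batch"

def sample_noise():
    calls.append("noise")
    return "z"

def update_discriminator(real_batch, noise):
    calls.append("D")
    return 1.0

def update_generator(real_batch, noise):
    calls.append("G")
    return 2.0

history = train_alternating(2, sample_real, sample_noise,
                            update_discriminator, update_generator)
```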
[0055] S5. Adaptive threshold determination: Because traditional fixed thresholds cannot meet the safety requirements of connected vehicles, an adaptive threshold search mechanism based on the F1 score is introduced during the model validation phase. The system traverses the candidate threshold space and determines the globally optimal classification threshold by calculating the harmonic mean of precision and recall under each threshold. Its objective function is as follows:

$$\tau^{*} = \arg\max_{\tau} \frac{2 \cdot \mathrm{Precision}(\tau) \cdot \mathrm{Recall}(\tau)}{\mathrm{Precision}(\tau) + \mathrm{Recall}(\tau)}$$

where $\tau$ represents the candidate probability threshold during the traversal, whose value range is ...; $\tau^{*}$ represents the globally optimal decision threshold determined by the search, which is fixed and deployed to the subsequent real-time detection process; $\mathrm{Precision}(\tau)$ is the precision at the current threshold $\tau$, i.e., the proportion of truly anomalous samples among all samples judged anomalous, reflecting the model's resistance to false positives; and $\mathrm{Recall}(\tau)$ is the recall at the current threshold $\tau$, i.e., the proportion of all real anomalies that are correctly detected, reflecting the model's resistance to false negatives. The formula as a whole finds the threshold that maximizes the harmonic mean of precision and recall (i.e., the F1 score), thereby achieving dynamic optimization of the overall system performance.
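As a non-limiting illustration, the threshold search can be sketched as an exhaustive scan over a list of candidate thresholds on a validation set. The sketch assumes that label 1 marks an anomalous sample, that a sample is flagged when its anomaly score is greater than or equal to the threshold, and that ties are resolved in favor of the first candidate encountered; the function names `precision_recall` and `best_f1_threshold` and the candidate grid are hypothetical.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when scores >= threshold are flagged as anomalous."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def best_f1_threshold(scores, labels, candidates):
    """Return (threshold, f1) maximizing the F1 score over the candidates."""
    best_tau, best_f1 = None, -1.0
    for tau in candidates:
        p, r = precision_recall(scores, labels, tau)
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        if f1 > best_f1:  # strict comparison keeps the first maximizer
            best_tau, best_f1 = tau, f1
    return best_tau, best_f1

# Toy validation set: two normal samples followed by two anomalous samples.
tau_star, f1_star = best_f1_threshold(
    [0.1, 0.2, 0.7, 0.9], [0, 0, 1, 1], [0.05, 0.5, 0.95]
)
```

In this toy case the middle candidate separates the two groups perfectly, so it is selected; the lowest candidate flags every sample and the highest flags none.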
[0056] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the present invention, and all such modifications or substitutions should be covered within the scope of the claims of the present invention.
Claims
1. A method for detecting abnormal behavior in vehicle-to-everything (V2X) networks based on generative adversarial networks, characterized in that the method includes the following steps:
S1: Image-based feature encoding: collect vehicle basic safety messages (BSM), normalize the multi-dimensional temporal physical features, and map them into a fixed-size RGB image tensor;
S2: Semi-supervised model construction: construct a semi-supervised generative adversarial network model consisting of a generator and a discriminator, wherein the generator uses a decoder structure to map random noise into pseudo-samples, and the discriminator integrates a convolutional block attention module and has an output layer with a total of K+1 dimensions covering the real classes and the pseudo-class;
S3: Convolutional neural network improvement: construct a convolutional neural network that combines channel attention and spatial attention, which enables the model to learn the importance weights of different physical feature channels and also to capture the spatial features of the image;
S4: Semi-supervised adversarial training: the discriminator updates its weights using a weighted combination of the supervised and unsupervised losses, and the generator uses a feature matching loss to reduce the distribution difference between real samples and pseudo-samples in the feature layer inside the discriminator, thereby improving training stability;
S5: Adaptive threshold determination: subject to the preset safety constraints, the probability space is traversed to dynamically search for the optimal classification decision threshold that maximizes the F1 score;
S6: Collaborative deployment detection: after training is completed on the cloud side, the model parameters are distributed to the roadside unit and the vehicle unit; the vehicle unit performs local, highly real-time preliminary screening, and the roadside unit performs regional collaborative inference, achieving hierarchical edge-end collaborative detection;
S7: Global decision response: after the edge detects an anomaly, it generates a misbehavior report and uploads it to the cloud side; the cloud-side certificate authority then makes the global final decision and carries out the removal of malicious nodes and the revocation of certificates.
2. The method for detecting abnormal behavior in vehicle networks based on generative adversarial networks according to claim 1, characterized in that: In S1, a continuous time-window data stream containing position coordinates, instantaneous velocity, acceleration, and heading angle features is collected in real time by vehicle-mounted sensors, and the physical features are uniformly mapped to the range 0 to 1 using a min-max normalization algorithm. A red-green-blue channel mapping strategy is implemented, mapping position features to the red channel, velocity features to the green channel, and acceleration features to the blue channel. A standard temporal image tensor with a size of 32×32×3 is generated using a bilinear interpolation algorithm.
3. The method for detecting abnormal behavior in vehicle networks based on generative adversarial networks according to claim 1, characterized in that: In step S2, the generator receives 128-dimensional random noise, reshapes the features through a fully connected layer, and then progressively upsamples them to the target size via three transposed convolutional layers. Each transposed convolutional layer is followed by a batch normalization layer and a rectified linear unit (ReLU) activation layer. The discriminator employs a feature extraction network composed of multiple convolutional blocks connected in series. Each convolutional block integrates two consecutive convolutional layers, a ReLU activation layer, a batch normalization layer, a convolutional block attention module, and a dropout layer. The discriminator extracts a 256-dimensional feature vector through a global average pooling layer and connects it to a fully connected layer to output a vector containing K-dimensional real-class probabilities and a 1-dimensional pseudo-class probability. During the inference phase, the probability score of a sample belonging to an anomalous class is determined by calculating the softmax probability values of the first K output dimensions.
4. The method for detecting abnormal behavior in vehicle networks based on generative adversarial networks according to claim 1, characterized in that: In S3, the convolutional block attention module learns the importance weights of different physical feature channels through cascaded channel attention units, and generates an attention map by performing pooling and convolution operations along the channel axis through the spatial attention unit to locate abnormal pixel regions in the image that violate the laws of physical motion.
5. The method for detecting abnormal behavior in vehicle networks based on generative adversarial networks according to claim 1, characterized in that, In S4, the supervised loss is calculated by the sparse cross-entropy of the labeled real samples on the first K units of the output layer; the unsupervised loss is achieved by maximizing the probability that the real sample belongs to the real class subset and maximizing the probability that the pseudo sample belongs to the pseudo class; the feature matching loss is used to stabilize the adversarial training by calculating the L1 norm distance between the mean of the feature vector of the real sample and the mean of the feature vector of the pseudo sample.
6. The method for detecting abnormal behavior in vehicle networks based on generative adversarial networks according to claim 1, characterized in that, In step S5, the candidate probability threshold space is traversed during the search. Subject to the hard safety constraints of a lower limit on the target recall and an upper limit on the maximum false positive rate, the threshold that maximizes the harmonic mean of precision and recall is selected as the optimal classification decision threshold.
7. The method for detecting abnormal behavior in vehicle networks based on generative adversarial networks according to claim 1, characterized in that, In S6, a collaborative deployment mode of centralized training and hierarchical inference is adopted. The trained model parameters are distributed to the edge nodes. The vehicle unit uses a sliding window to perform local real-time preliminary screening on the received basic safety message image stream, and the roadside unit aggregates the multi-source detection results within the geographical coverage area to perform regional collaborative inference verification.
8. The method for detecting abnormal behavior in vehicle networks based on generative adversarial networks according to claim 1, characterized in that, In S7, a global adjudication and closed-loop response mechanism is established. When the anomaly score determined at the edge exceeds the optimal classification decision threshold, a misbehavior report containing the evidence chain is generated. The cloud-side certificate authority makes the final judgment in combination with the global reputation assessment model. If the behavior is judged to be an attack, the certificate revocation list is updated and issued immediately to realize a secure response to malicious vehicle nodes.