Method and equipment for generating adversarial examples of traffic signs for adversarial training
By performing multi-scale bounding box adjustment and physical adaptation processing on traffic sign images, and combining vehicle, road, and camera imaging models, high-quality adversarial attack samples are generated. This solves the problem of poor robustness of adversarial samples in existing technologies and improves the safety of autonomous driving systems.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- XIDIAN UNIV
- Filing Date
- 2024-06-17
- Publication Date
- 2026-06-30
AI Technical Summary
Existing adversarial example generation methods struggle to fit actual physical transformations during training, resulting in poor robustness of the generated adversarial examples and an inability to effectively improve the safety of autonomous driving systems.
By performing multi-scale bounding box adjustment, lateral compression, and pre-defined physical adaptation processing on traffic sign images, combined with vehicle road models and camera imaging models, traffic sign adaptation samples are generated. Furthermore, adversarial training is performed using style loss, content loss, and smoothness loss to generate high-quality adversarial attack samples.
It improves the quality and concealment of adversarial attack samples, enhances the adversarial detection effect of autonomous driving systems, and avoids the problem of poor concealment caused by overfitting in existing methods.
Figure CN118629008B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of artificial intelligence and cybersecurity technology, and specifically relates to a method and device for generating adversarial samples of traffic signs for adversarial training.
Background Technology
[0002] Autonomous driving technology aims to enable vehicles to autonomously complete driving tasks, improving driving efficiency and safety. With nearly 50 years of development, autonomous driving technology has become one of the important development directions in the automotive industry. As technology continues to advance and policies gradually loosen, autonomous vehicles are being applied in more scenarios. The future development trend is towards higher levels of automation, intelligence, and connectivity, and deep integration with intelligent transportation systems.
[0003] While autonomous driving technology is developing rapidly, safety issues are becoming increasingly prominent, posing a key obstacle to its further widespread application. Although autonomous driving technology can improve driving efficiency and safety, the consequences of technical malfunctions or system errors could be disastrous. Therefore, while ensuring the continuous advancement of autonomous driving technology, it is crucial to prioritize its safety, strengthen technological research and development and testing, and ensure that autonomous driving systems maintain stable and reliable operation in various complex environments and unforeseen circumstances.
[0004] Adversarial sample attacks on traffic signs are misleading attacks targeting the visual object detectors of intelligent vehicles. They involve adding pixel-by-pixel perturbations to ordinary traffic sign images, making the resulting samples undetectable by the autonomous driving detector or misclassifying them as incorrect traffic sign categories. However, to the human eye, these samples still appear to be the original traffic sign categories. For autonomous vehicles, the inability to detect traffic sign categories or misclassification can lead to serious traffic accidents. For example, failing to detect a speed limit sign, or misclassifying a speed limit of 60 as a speed limit of 100, can result in serious speeding. Therefore, researching adversarial sample attacks on traffic signs is crucial to enhancing the safety of autonomous vehicles.
[0005] Existing adversarial example generation methods generally adapt to actual physical transformations by fitting target transformations during training. However, the generated adversarial examples can only fit as many transformations as possible, which not only makes fitting difficult during training, but may also overfit to some situations that are impossible to encounter when the vehicle is driving. This results in poor robustness of the generated adversarial examples, and consequently, poor adversarial performance of the adversarial model trained using these adversarial examples.
Summary of the Invention
[0006] To address the aforementioned problems in the prior art, this invention provides a method and apparatus for generating adversarial examples of traffic signs for adversarial training.
[0007] The technical problem to be solved by this invention is achieved through the following technical solution:
[0008] In a first aspect, the present invention provides a method for generating adversarial examples of traffic signs for adversarial training, comprising:
[0009] Obtain traffic sign images and attack vectors;
[0010] The traffic sign images are sequentially subjected to multi-scale bounding box adjustment, lateral compression, and preset physical adaptation processing to obtain traffic sign adaptation samples.
[0011] Image embedding is performed on the traffic sign adaptation samples to obtain embedded samples;
[0012] The embedded sample is input into the first target detection model, and the embedded sample is iteratively processed based on the first target detection model and the attack vector to obtain the adversarial attack sample.
[0013] The adversarial attack sample is input into the second target detection model, and the second target detection model is adversarially trained to obtain an adversarial second target detection model.
[0014] The first target detection model is an attack-side detection model with its node parameters frozen; the second target detection model is an adversarial-side detection model with its node parameters not frozen; the preset loss function of the first target detection model is used to calculate the sum of style loss, content loss, adversarial loss, and smoothing loss; the traffic sign adaptation samples are obtained based on the vehicle road model and camera imaging model; the values of style loss and content loss are obtained based on the style extraction model.
[0015] Optionally, the traffic sign image is sequentially subjected to multi-scale bounding box adjustment, lateral compression, and preset physical adaptation processing to obtain traffic sign adaptation samples, including:
[0016] Based on the vehicle road model and preset attack parameters, the traffic sign image is adjusted at multiple scales to obtain the first target image;
[0017] The second target image is obtained by performing lateral compression processing on the first target image based on the camera imaging model.
[0018] The second target image is subjected to a preset physical adaptation process to obtain traffic sign adaptation samples.
[0019] Optionally, the preset attack parameters include: preset attack range and preset attack target;
[0020] The preset attack range includes: the maximum detection distance $D_{\max}$ and the shortest processing distance $D_{\min}$.
[0021] Optionally, the preset physical adaptation processing includes: brightness transformation, contrast transformation, saturation transformation, resolution transformation, Gaussian noise transformation, and Gaussian blur transformation.
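The six transformations listed above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the patented implementation: the function names, the parameter ranges drawn in `physical_adaptation`, and the assumption that images are float arrays of shape (H, W, 3) with values in [0, 1] are all hypothetical choices made for the example.

```python
import numpy as np

def adjust_brightness(img, delta):
    """Add a constant offset to every pixel."""
    return np.clip(img + delta, 0.0, 1.0)

def adjust_contrast(img, factor):
    """Scale pixel deviations from the global mean intensity."""
    mean = img.mean()
    return np.clip((img - mean) * factor + mean, 0.0, 1.0)

def adjust_saturation(img, factor):
    """Scale deviations from the per-pixel gray value (factor 0 gives grayscale)."""
    gray = img.mean(axis=2, keepdims=True)
    return np.clip((img - gray) * factor + gray, 0.0, 1.0)

def reduce_resolution(img, step):
    """Downsample by striding, then upsample by pixel repetition."""
    h, w = img.shape[:2]
    small = img[::step, ::step]
    up = np.repeat(np.repeat(small, step, axis=0), step, axis=1)
    return up[:h, :w]

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def gaussian_blur(img, sigma, radius=2):
    """Separable Gaussian blur along height and width, with edge padding."""
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    out = img
    for axis in (0, 1):
        n = out.shape[axis]
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = sum(k * np.take(padded, np.arange(i, i + n), axis=axis)
                  for i, k in enumerate(kernel))
    return out

def physical_adaptation(img, rng):
    """Apply the six transformations once, with randomly drawn strengths.
    The ranges below are assumed values, not taken from the source."""
    img = adjust_brightness(img, rng.uniform(-0.2, 0.2))
    img = adjust_contrast(img, rng.uniform(0.7, 1.3))
    img = adjust_saturation(img, rng.uniform(0.7, 1.3))
    img = reduce_resolution(img, int(rng.integers(1, 4)))
    img = add_gaussian_noise(img, rng.uniform(0.0, 0.05), rng)
    img = gaussian_blur(img, rng.uniform(0.5, 1.5))
    return np.clip(img, 0.0, 1.0)
```

Drawing a fresh set of strengths for every training iteration is one common way to expose the sample to varied imaging conditions; the source does not state how the strengths are chosen.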
[0022] Optionally, the embedded sample is input into a first target detection model, and the embedded sample is iteratively processed based on the first target detection model and the attack vector to obtain adversarial attack samples, including:
[0023] Input the embedded sample into the first target detection model;
[0024] The embedded samples are iteratively processed based on the first target detection model and the preset loss function corresponding to the attack vector;
[0025] Embedded samples whose loss value of the preset loss function is less than a preset threshold are used as adversarial attack samples.
[0026] Optionally, the preset loss function includes: a first preset loss or a second preset loss; the attack vector includes: a hidden attack or a target attack;
[0027] Iterative processing of embedded samples is performed based on the first object detection model and the preset loss function corresponding to the attack vector, including:
[0028] When the attack vector is a hidden attack, the embedded samples are iteratively processed based on the first target detection model and the first preset loss.
[0029] When the attack vector is a target attack, the embedded sample is iteratively processed based on the first target detection model and the second preset loss. For a hidden attack, the probability of the target object output by the first target detection model after detecting the adversarial attack sample is less than the threshold; for a target attack, the probability that the first target detection model detects the adversarial attack sample as a non-target object is greater than the threshold.
[0030] Optionally, the calculation process corresponding to the values of style loss and content loss includes:
[0031] Acquire traffic sign images and target style images;
[0032] The traffic sign image and the target style image are input into the style extraction model to perform style overlay processing on the traffic sign image to obtain the style traffic sign image;
[0033] The style loss value is calculated from the style difference between the style traffic sign image and the target style image, using the style loss.
[0034] The content loss value is calculated from the content difference between the style traffic sign image and the traffic sign image, using the content loss.
[0035] Optionally, the corresponding bounding box size adjustment for multi-scale bounding boxes is represented as follows:
[0036]
[0037] where the left-hand side of the expression denotes the bounding box size at the $\varepsilon$-th scale, $\varepsilon$ denotes the $\varepsilon$-th scale, $M$ denotes the total number of scales, $f$ denotes the camera focal length, $L$ denotes the size of the traffic sign image, $P_D$ denotes the pixel density of the camera, $D_{\max}$ denotes the furthest detection distance, and $D_{\min}$ denotes the shortest processing distance;
[0038] The compression angle of lateral compression processing is expressed as:
[0039]
[0040] The compression ratio of lateral compression processing is expressed as:
[0041] $r = \cos a$;
[0042] r represents the compression ratio of the lateral compression process, a represents the compression angle of the lateral compression process, and W represents the perpendicular distance between the traffic sign image and the straight line containing the vehicle's direction of travel.
[0043] Optionally, the first preset loss is expressed as:
[0044]
[0045] where $L_H$ denotes the first preset loss, whose adversarial term is the first adversarial loss corresponding to the first preset loss; $L_{style}$ denotes the style loss; $L_{content}$ denotes the content loss; $L_{smooth}$ denotes the smoothing loss; $m$ denotes the total number of background images used for image embedding; $n$ denotes the number of traffic sign adaptation samples embedded in each background image; $k$ denotes the number of bounding boxes retained after bounding box filtering during detection; $P_{ijl}$ denotes the probability that the $j$-th traffic sign adaptation sample in the $i$-th background image exists in the $l$-th bounding box retained after bounding box filtering; $w_a$ denotes the adversarial loss coefficient; $w_s$ denotes the style loss coefficient; $w_c$ denotes the content loss coefficient; and $w_m$ denotes the smoothing loss coefficient;
[0046] The second preset loss is expressed as:
[0047]
[0048] where $L_T$ denotes the second preset loss, whose adversarial term is the second adversarial loss corresponding to the second preset loss; $V_{ijl}^{y'}$ denotes the classification confidence of the attack target category $y'$; $V_{ijl}^{z}$ denotes the classification confidence of each category $z$ other than $y'$; and $N$ denotes the total number of categories of the adversarial attack samples.
[0049] In a second aspect, the present invention provides a traffic sign adversarial example generation device for adversarial training, comprising: a processor, a storage medium and a bus, wherein the storage medium stores machine-readable instructions executable by the processor, and when the traffic sign adversarial example generation device for adversarial training is running, the processor communicates with the storage medium via the bus, and the processor executes the machine-readable instructions to perform the steps of the traffic sign adversarial example generation method for adversarial training as described in the first aspect above.
[0050] This invention provides a method and apparatus for generating adversarial examples of traffic signs for adversarial training. The method includes: acquiring traffic sign images and attack vectors; sequentially performing multi-scale bounding box adjustment, lateral compression, and preset physical adaptation processing on the traffic sign images to obtain adapted traffic sign samples; embedding the adapted traffic sign samples into images to obtain embedded samples; inputting the embedded samples into a first target detection model, and iteratively processing the embedded samples based on the first target detection model and the attack vectors to obtain adversarial attack samples; inputting the adversarial attack samples into a second target detection model, and adversarially training the second target detection model to obtain an adversarial second target detection model; the first target detection model is an attack-side detection model with frozen node parameters; the second target detection model is an adversarial-side detection model with unfrozen node parameters; a preset loss function of the first target detection model is used to calculate the sum of style loss, content loss, adversarial loss, and smoothing loss; the adapted traffic sign samples are obtained based on a vehicle road model and a camera imaging model; the values of style loss and content loss are obtained based on a style extraction model. In this invention, traffic sign adaptation samples are obtained by using vehicle road models and camera imaging models based on actual roads and shooting equipment, making the final generated adversarial attack samples more in line with the actual scene and improving the quality of adversarial attack samples. 
Secondly, adversarial attack samples are obtained through the combined effect of style loss, content loss, adversarial loss, and smoothing loss, avoiding the problem that existing physical world adversarial attack methods add excessive perturbations to enhance robustness, resulting in poor concealment of adversarial attack samples. Finally, adversarial training of the second target detection model is carried out based on the improved quality and concealment of adversarial attack samples, thereby improving the adversarial detection effect of the second target detection model.
[0051] The present invention will be further described in detail below with reference to the accompanying drawings and embodiments.
Description of the Drawings
[0052] Figure 1 is a flowchart of a traffic sign adversarial sample generation method for adversarial training provided in an embodiment of the present invention;
[0053] Figure 2 is a flowchart of a traffic sign adversarial sample generation method for adversarial training provided in an embodiment of the present invention;
[0054] Figure 3 is a schematic diagram of a vehicle road model provided in an embodiment of the present invention;
[0055] Figure 4 is a schematic diagram of a camera imaging model provided in an embodiment of the present invention;
[0056] Figure 5 is a flowchart of a traffic sign adversarial sample generation method for adversarial training provided in an embodiment of the present invention;
[0057] Figure 6 shows the attack success rates of the adversarial attack samples of the present invention under hidden attack (HA) and target attack (TA), provided in an embodiment of the present invention;
[0058] Figure 7 shows adversarial attack samples generated by various methods, provided in an embodiment of the present invention;
[0059] Figure 8 is a schematic diagram of a traffic sign adversarial sample generation device for adversarial training provided in an embodiment of the present invention.
Detailed Implementation
[0060] The present invention will be further described in detail below with reference to specific embodiments, but the implementation of the present invention is not limited thereto.
[0061] To improve the adversarial detection performance of adversarial models, this invention provides a method for generating adversarial samples of traffic signs for adversarial training. Figure 1 is a flowchart of this method provided in an embodiment of the present invention. As shown in Figure 1, the method includes:
[0062] S101. Obtain traffic sign images and attack vectors.
[0063] S102. Perform multi-scale bounding box adjustment, lateral compression, and preset physical adaptation processing on the traffic sign image in sequence to obtain traffic sign adaptation samples.
[0064] Optionally, S102 may specifically include:
[0065] Based on the vehicle road model and preset attack parameters, the traffic sign image is adjusted at multiple scales to obtain the first target image;
[0066] The second target image is obtained by performing lateral compression processing on the first target image based on the camera imaging model.
[0067] The second target image is subjected to a preset physical adaptation process to obtain traffic sign adaptation samples.
[0068] Optionally, the preset attack parameters include: preset attack range and preset attack target;
[0069] The preset attack range includes: the maximum detection distance $D_{\max}$ and the shortest processing distance $D_{\min}$.
[0070] It should be noted that the maximum detection distance $D_{\max}$ refers to the farthest distance that the detection equipment can detect, and the shortest processing distance $D_{\min}$ refers to the shortest distance allowed by traffic regulations. For example, for a stop sign at a pedestrian crossing, it is the shortest distance at which traffic law allows a vehicle to stop.
[0071] Optionally, the preset physical adaptation processing includes: brightness transformation, contrast transformation, saturation transformation, resolution transformation, Gaussian noise transformation, and Gaussian blur transformation.
[0072] S103. Image embedding is performed on the traffic sign adaptation samples to obtain embedded samples.
[0073] S104. Input the embedded sample into the first target detection model, and iteratively process the embedded sample based on the first target detection model and the attack vector to obtain the adversarial attack sample.
[0074] Optionally, S104 may specifically include:
[0075] Input the embedded sample into the first target detection model;
[0076] The embedded samples are iteratively processed based on the first target detection model and the preset loss function corresponding to the attack vector;
[0077] Embedded samples whose loss value of the preset loss function is less than a preset threshold are used as adversarial attack samples.
[0078] S105. Input the adversarial attack sample into the second target detection model and perform adversarial training on the second target detection model to obtain an adversarial second target detection model.
[0079] The first target detection model is an attack-side detection model with its node parameters frozen; the second target detection model is an adversarial-side detection model with its node parameters not frozen; the preset loss function of the first target detection model is used to calculate the sum of style loss, content loss, adversarial loss, and smoothing loss; the traffic sign adaptation samples are obtained based on the vehicle road model and camera imaging model; the values of style loss and content loss are obtained based on the style extraction model.
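The division of roles between the two detection models in S104 and S105 can be illustrated with a deliberately tiny stand-in. The sketch below assumes a one-layer logistic "detector" in place of a real object detector, a hiding-style loss equal to the detection probability, and plain gradient descent; none of these choices comes from the source. It only shows the mechanics: the attack-side parameters stay frozen while the sample is updated until the loss falls below a threshold, and the adversarial-side parameters are then updated on the resulting sample.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def generate_adversarial(x0, w_frozen, b_frozen, threshold=0.1, lr=1.0, max_iter=500):
    """S104 (toy version): update the sample, never the detector parameters."""
    x = x0.copy()
    for _ in range(max_iter):
        p = sigmoid(w_frozen @ x + b_frozen)   # detection probability
        if p < threshold:                      # loss below preset threshold: stop
            break
        grad = p * (1.0 - p) * w_frozen        # gradient of p with respect to x
        x = np.clip(x - lr * grad, 0.0, 1.0)   # keep pixel values valid
    return x

def adversarial_train(x_adv, w, b, label=1.0, lr=0.5, steps=200):
    """S105 (toy version): update a copy of the parameters on the adversarial sample."""
    w, b = w.copy(), float(b)
    for _ in range(steps):
        p = sigmoid(w @ x_adv + b)
        err = p - label                        # cross-entropy gradient w.r.t. the logit
        w -= lr * err * x_adv
        b -= lr * err
    return w, b

# Hypothetical 16-pixel "sign" and detector weights, chosen only for the demo.
w_attack = np.full(16, 0.5)
b_attack = -3.0
x_clean = np.ones(16)

x_adv = generate_adversarial(x_clean, w_attack, b_attack)
w_def, b_def = adversarial_train(x_adv, w_attack, b_attack)
```

In the method described above the first model would be a full detector such as YOLOv5 and the loss would also contain style, content, and smoothing terms; the toy keeps only the frozen versus trainable distinction.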
[0080] To clearly illustrate the overall process of the traffic sign adversarial example generation method for adversarial training, Figure 2 is a flowchart of the method provided as an embodiment of the present invention. As shown in Figure 2, the traffic sign image (a speed limit 40 sign) sequentially undergoes bounding box pre-selection, camera offset and viewpoint offset modeling, and other physical adaptation processes in the physical adversarial attack module, yielding a traffic sign adaptation sample. This sample is then embedded into the actual physical scene to obtain an embedded sample. The embedded sample is input into the target detection system (the first target detection model), and adversarial samples (adversarial attack samples) are obtained based on the adversarial loss calculation results. Here, $L$ denotes the loss function of the first target detection model, $L_{style}$ the style loss, $L_{content}$ the content loss, $L_{smooth}$ the smoothing loss, and $L_{adv}$ the adversarial loss. The values of $L_{style}$ and $L_{content}$ are determined by the style transfer module. Specifically, the traffic sign image (the speed limit 40 sign) and the style image are input into the style extraction model, style is added to the traffic sign image, and the values of $L_{style}$ and $L_{content}$ are obtained based on the difference between the generated style traffic sign image and the original traffic sign image. In addition, a smoothing loss is added to the physical adversarial attack module. It computes differences between neighboring pixels of the adversarial attack sample itself, so that the sample ends up with a locally smooth appearance.
[0081] Furthermore, it should be noted that in the physical adversarial attack module, bounding box pre-selection and camera view offset modeling are performed based on the vehicle and road model proposed in this embodiment of the invention. This ensures that the adversarial attack samples created in this embodiment of the invention can concentrate the attack on the attacker's designed attack range, and can simulate attacks in a real-world environment, avoiding overfitting of adversarial samples to distances and angles outside the attack range.
[0082] Figure 3 is a schematic diagram of a vehicle road model provided in an embodiment of the present invention, and Figure 4 is a schematic diagram of a camera imaging model provided in an embodiment of the present invention. As shown in Figure 3, the vehicle road model provided in this embodiment describes how the camera detects the traffic sign image while the vehicle travels (including distance, angle, and so on); it is not a vehicle driving simulator. Let $D$ (unit: meters) be the distance between the projection of the traffic sign image onto the vehicle's travel direction and the vehicle's current position, $L$ (unit: centimeters) the size of the traffic sign image, $f$ the camera focal length (unit: millimeters), $P_D$ the pixel density of the camera (unit: pixels/cm), and $s_p$ (unit: pixels) the final size of the traffic sign image in the camera frame. Then, according to the imaging principle shown in Figure 3, the relationship between the imaging size of the traffic sign adaptation sample and the distance is:
[0083]
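Formula (1) itself is not reproduced in this text. As a rough guide, the standard pinhole (similar triangles) relation with the units stated above would look like the sketch below. The function name and the unit conversions are assumptions made for illustration, and the exact constants in the original formula may differ.

```python
def imaged_size_px(f_mm, L_cm, P_D, D_m):
    """Pinhole estimate of the sign's size in the camera frame, in pixels.

    Assumed units, following the surrounding text:
    f_mm focal length in mm, L_cm sign size in cm,
    P_D pixel density in pixels/cm, D_m distance in m.
    """
    f_cm = f_mm / 10.0                        # mm -> cm
    D_cm = D_m * 100.0                        # m -> cm
    size_on_sensor_cm = f_cm * L_cm / D_cm    # similar triangles
    return size_on_sensor_cm * P_D            # cm on sensor -> pixels
```

Under this assumed form the pixel size is inversely proportional to the distance, so halving the distance doubles the size of the sign in the frame.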
[0084] With reference to Figures 3 and 4, let $d$ (in meters) be the straight-line distance between the vehicle and the traffic sign image, $W$ (in meters) the length of the perpendicular from the target (the traffic sign image) to the straight line along the vehicle's direction of travel, and $a$ (in radians, $a \in (0, \pi/2)$) the offset angle of the target relative to the front of the vehicle. Then the relationship between distance $D$ and angle $a$ is as follows:
[0085]
[0086] Substituting formula (2) into formula (1) gives the relationship between the angle $a$ and the bounding box size $s_p$:
[0087]
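Formulas (2) and (3) are not reproduced in this text. Under the geometry described, with $D$ measured along the direction of travel and $W$ perpendicular to it, one plausible reconstruction is given below. It assumes that formula (1) has the pinhole form $s_p = c\,f\,L\,P_D / D$, where $c$ is a hypothetical unit-conversion constant ($c = 1/1000$ for the units stated in the text); the original expressions may be written differently.

```latex
\begin{aligned}
\tan a &= \frac{W}{D}, \qquad d = \sqrt{D^{2} + W^{2}}
  && \text{(geometry of the vehicle road model)} \\
s_p &= \frac{c\, f\, L\, P_D}{D}
  && \text{(assumed pinhole form of formula (1))} \\
s_p &= \frac{c\, f\, L\, P_D \tan a}{W}
  \;\Longrightarrow\;
  a = \arctan\!\left(\frac{W\, s_p}{c\, f\, L\, P_D}\right)
  && \text{(substituting } D = W / \tan a \text{)}
\end{aligned}
```

With $a$ known, the lateral compression ratio follows as $r = \cos a$, which is the apparent narrowing of a sign that faces along the road when it is viewed from an offset angle $a$.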
[0088] Based on equations (1) and (3), and by substituting the other values of the attack scenario, the lateral compression ratio $r = \cos a$ is calculated. This guides the physical adaptation during the training of adversarial attack samples, ensuring that the attack strength of the adversarial attack samples is fitted mainly within the attack range selected by the attacker and avoiding overfitting of adversarial attack samples to unreasonable distances.
[0089] Furthermore, in this embodiment of the invention, once the preset attack range (the furthest detection distance $D_{\max}$ and the shortest processing distance $D_{\min}$) is determined, it can be substituted into formula (1), and the entire scale range can be divided into $M$ equal parts to obtain the multi-scale bounding box sizes.
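A minimal sketch of this division, assuming the pinhole form $s_p = f L P_D / (1000 D)$ for formula (1) (units: mm, cm, pixels/cm, m) and assuming that the scale index runs from 0 to $M$ so that both end points are included. Neither the constant nor the indexing convention is confirmed by the text, and the function name is hypothetical.

```python
def multiscale_box_sizes(f_mm, L_cm, P_D, D_max_m, D_min_m, M):
    """Split the pixel-size range between D_max and D_min into M equal parts."""
    s_far = f_mm * L_cm * P_D / (1000.0 * D_max_m)    # smallest box, at D_max
    s_near = f_mm * L_cm * P_D / (1000.0 * D_min_m)   # largest box, at D_min
    step = (s_near - s_far) / M
    return [s_far + eps * step for eps in range(M + 1)]

# Example with assumed camera and scene values.
sizes = multiscale_box_sizes(f_mm=6, L_cm=60, P_D=5000, D_max_m=60, D_min_m=10, M=5)
```

Because the sizes are spaced evenly in pixels rather than in meters, the corresponding distances are denser near $D_{\max}$ than near $D_{\min}$.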
[0090] Once the bounding box size of the traffic sign image and the distance between the vehicle and the curb are determined, the distance between the vehicle and the traffic sign image, the pixel size of the traffic sign image in the frame, and the angular offset of the traffic sign image relative to the camera direction are in one-to-one correspondence. Therefore, once the pixel size is selected, the compression angle $\alpha_\varepsilon$ of the lateral compression process can be determined:
[0091]
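The expression for the compression angle is not reproduced in this text. The sketch below shows one way it could be computed and applied, assuming the pinhole form $s_p = f L P_D / (1000 D)$ for formula (1) and $\tan a = W / D$ for formula (2). The function names and the nearest-neighbour resampling are illustrative choices, not taken from the source.

```python
import math
import numpy as np

def compression_angle(s_px, f_mm, L_cm, P_D, W_m):
    """Angle (radians) implied by a chosen pixel size, under the assumed formulas."""
    D_m = f_mm * L_cm * P_D / (1000.0 * s_px)   # invert the assumed formula (1)
    return math.atan2(W_m, D_m)                  # tan(a) = W / D

def lateral_compress(img, angle_rad):
    """Shrink the image width by r = cos(angle) with nearest-neighbour sampling."""
    h, w = img.shape[:2]
    r = math.cos(angle_rad)
    new_w = max(1, int(round(w * r)))
    cols = np.minimum((np.arange(new_w) / r).astype(int), w - 1)
    return img[:, cols]
```

For example, a sign imaged at a pixel size that corresponds to $D = 30$ m with $W = 30$ m gives an angle of $\pi/4$, so the sign would be compressed to about 71% of its width.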
[0092] In this embodiment of the invention, the compression angle $\alpha_i$ can be used to simulate the angular offset and to guide the embedding of traffic sign images into background images. The vertical angular offset is largely consistent with the horizontal offset.
[0093] It should be noted that in this embodiment of the invention, YOLOv5 can be used as the first object detection model. The VGG-19 model (Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.) is used as the style extractor and content extractor. The style loss $L_{style}$ is calculated using the style information from the style image and the style traffic sign image. Furthermore, the content loss $L_{content}$ is calculated using the content information of the style traffic sign images obtained during the iteration process and of the original traffic sign images.
[0094] Optionally, the preset loss function includes: a first preset loss or a second preset loss; the attack vector includes: a hidden attack or a target attack;
[0095] Iterative processing of embedded samples is performed based on the first object detection model and the preset loss function corresponding to the attack vector, including:
[0096] When the attack vector is a hidden attack, the embedded samples are iteratively processed based on the first target detection model and the first preset loss.
[0097] When the attack vector is a targeted attack, the embedded samples are iteratively processed based on the first target detection model and the second preset loss. For a hidden attack, the probability of the target object output by the first target detection model after detecting the adversarial attack sample is less than a threshold. For a targeted attack, the probability of the adversarial attack sample being detected as a non-target object by the first target detection model is greater than a threshold. It should be noted that the target object is the category that the first detection model is intended to identify the adversarial attack sample as.
[0098] Optionally, the calculation process corresponding to the values of style loss and content loss includes:
[0099] Acquire traffic sign images and target style images;
[0100] The traffic sign image and the target style image are input into the style extraction model to perform style overlay processing on the traffic sign image to obtain the style traffic sign image;
[0101] The style loss value is calculated from the style difference between the style traffic sign image and the target style image, using the style loss.
[0102] The content loss value is calculated from the content difference between the style traffic sign image and the traffic sign image, using the content loss.
[0103] Optionally, the corresponding bounding box size adjustment for multi-scale bounding boxes is represented as follows:
[0104]
[0105] where the left-hand side of the expression denotes the bounding box size at the $\varepsilon$-th scale, $\varepsilon$ denotes the $\varepsilon$-th scale, $M$ denotes the total number of scales, $f$ denotes the camera focal length, $L$ denotes the size of the traffic sign image, $P_D$ denotes the pixel density of the camera, $D_{\max}$ denotes the furthest detection distance, and $D_{\min}$ denotes the shortest processing distance;
[0106] The compression angle of lateral compression processing is expressed as:
[0107]
[0108] The compression ratio of lateral compression processing is expressed as:
[0109] $r = \cos a$;
[0110] r represents the compression ratio of the lateral compression process, a represents the compression angle of the lateral compression process, and W represents the perpendicular distance between the traffic sign image and the straight line containing the vehicle's direction of travel.
[0111] Optionally, the first preset loss is expressed as:
[0112]
[0113] where $L_H$ denotes the first preset loss, whose adversarial term is the first adversarial loss corresponding to the first preset loss; $L_{style}$ denotes the style loss; $L_{content}$ denotes the content loss; $L_{smooth}$ denotes the smoothing loss; $m$ denotes the total number of background images used for image embedding; $n$ denotes the number of traffic sign adaptation samples embedded in each background image; $k$ denotes the number of bounding boxes retained after bounding box filtering during detection; $P_{ijl}$ denotes the probability that the $j$-th traffic sign adaptation sample in the $i$-th background image exists in the $l$-th bounding box retained after bounding box filtering; $w_a$ denotes the adversarial loss coefficient; $w_s$ denotes the style loss coefficient; $w_c$ denotes the content loss coefficient; and $w_m$ denotes the smoothing loss coefficient;
[0114] The second preset loss is expressed as:
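The expression for the first preset loss is not reproduced in this text, but the definitions above indicate a weighted sum of four terms. The sketch below is one plausible reading, not the patented formula: it assumes that the adversarial term is the mean of $P_{ijl}$ over the $m$ background images, $n$ embedded samples, and $k$ retained boxes. The function names and default weights are hypothetical.

```python
import numpy as np

def hiding_adversarial_loss(P):
    """Mean detection probability; P has shape (m, n, k). Averaging is an assumption."""
    return float(np.mean(P))

def first_preset_loss(P, l_style, l_content, l_smooth,
                      w_a=1.0, w_s=1.0, w_c=1.0, w_m=1.0):
    """Weighted sum of adversarial, style, content, and smoothing losses."""
    return (w_a * hiding_adversarial_loss(P)
            + w_s * l_style
            + w_c * l_content
            + w_m * l_smooth)
```

Minimizing the adversarial term pushes the detection probabilities toward zero, which matches the stated goal of a hidden attack, while the other three terms constrain the appearance of the sample.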
[0115]
[0116] where $L_T$ denotes the second preset loss, whose adversarial term is the second adversarial loss corresponding to the second preset loss; $V_{ijl}^{y'}$ denotes the classification confidence of the attack target category $y'$; $V_{ijl}^{z}$ denotes the classification confidence of each category $z$ other than $y'$; and $N$ denotes the total number of categories of the adversarial attack samples.
[0117] It should be noted that the initial value of $V_{ijl}^{y'}$ is often very small (on the order of 1e-5). If it is placed in the denominator of the loss function, the loss value can easily become too large during backpropagation, or even become NaN, and this cannot be avoided by adjusting the weight values. Therefore, in this embodiment of the invention, the loss function for adversarial attack samples takes the form of a mean squared error, in which the final expected value of $V_{ijl}^{y'}$ is 1 and the final expected value of $V_{ijl}^{z}$ is 0. Using this loss function prevents training failure caused by an excessively large loss value in almost all cases.
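The numerical argument can be checked with a small example. The sketch below assumes that the confidences are stored in an array of shape (m, n, k, N) and that the mean squared error is averaged over all entries; the exact normalization in the original formula is not shown in this text, and the reciprocal loss is only a hypothetical stand-in for a loss with the confidence in its denominator.

```python
import numpy as np

def targeted_mse_loss(V, target):
    """MSE between confidences and a one-hot target; V has shape (m, n, k, N)."""
    expected = np.zeros(V.shape[-1])
    expected[target] = 1.0            # target class should reach 1, others 0
    return float(np.mean((V - expected) ** 2))

def targeted_mse_grad(V, target):
    """Gradient of the MSE loss with respect to V."""
    expected = np.zeros(V.shape[-1])
    expected[target] = 1.0
    return 2.0 * (V - expected) / V.size

# One box, four classes; the target class starts with a tiny confidence.
V = np.array([[[[1e-5, 0.6, 0.3, 0.1]]]])
mse = targeted_mse_loss(V, target=0)
mse_grad = targeted_mse_grad(V, target=0)

# Hypothetical reciprocal-style loss for comparison: 1 / V_target.
reciprocal = 1.0 / V[..., 0]
reciprocal_grad = -1.0 / V[..., 0] ** 2
```

With confidences in [0, 1] the MSE loss and its gradient stay bounded regardless of how small the initial target confidence is, whereas the reciprocal form and its gradient grow without bound as the confidence approaches zero.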
[0118] Traditional adversarial attacks use the Lp norm (usually L2 or L∞ norm) to limit the visibility of the attack. However, in physical world attack scenarios, to ensure the robustness of the attack, the attack perturbation strength is often greater than in the digital environment, and pixel-by-pixel perturbations are easily detected by the human eye. In this embodiment of the invention, style transfer technology is used to disguise the embedded samples. Based on traffic sign images and style images, a style extraction model is used to obtain a style traffic sign image, and the mean squared error between the Gram matrices of the two (style traffic sign image and target style image) is calculated. This error is used to optimize the style features of the adversarial attack sample.
[0119]
[0120] Here, $S$ denotes the set of layers in the feature extractor used to calculate the style loss; $ls$ denotes the $ls$-th layer in that set; $x'$ and $x_{style}$ denote the adversarial attack sample during the iteration process and the style image, respectively; $G(x')_{ls}$ denotes the Gram matrix extracted from $x'$ at layer $ls$; and $G(x_{style})_{ls}$ denotes the Gram matrix extracted from $x_{style}$ at layer $ls$.
[0121] In addition, to preserve the content information of the adversarial attack samples as much as possible, the content loss of the adversarial attack samples is also optimized:
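The Gram-matrix comparison described above can be sketched as follows. This is a minimal illustration, not the patented formula: feature maps are represented as lists of flattened channels, and the normalisation by the number of spatial positions and by the number of matrix entries is an assumption.

```python
def gram_matrix(features):
    """Gram matrix of a feature map given as C channels, each a flat list of H*W values."""
    size = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / size for fj in features]
            for fi in features]


def style_loss(layers_x, layers_style):
    """Sum over layers of the mean squared error between Gram matrices."""
    loss = 0.0
    for fx, fs in zip(layers_x, layers_style):
        gx, gs = gram_matrix(fx), gram_matrix(fs)
        c = len(gx)
        loss += sum((gx[i][j] - gs[i][j]) ** 2
                    for i in range(c) for j in range(c)) / (c * c)
    return loss


# One layer, two channels, two spatial positions each (toy values).
fx = [[1.0, 0.0], [0.0, 1.0]]
fs = [[1.0, 1.0], [1.0, 1.0]]
same = style_loss([fx], [fx])
different = style_loss([fx], [fs])
```

The Gram matrix discards spatial layout and keeps only channel co-activation statistics, which is why it captures style rather than content.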
[0122]
[0123] Here, $x$ is the original embedded sample, $C(x)$ denotes the feature map output by the style extraction model for $x$, and $C(x')$ denotes the feature map output by the style extraction model for $x'$.
[0124] Smoothing loss is another important loss term. It suppresses the high-frequency components of the perturbation so that the generated adversarial attack sample has as smooth an appearance as possible. As a result, the sample can still maintain a good attack effect at long distances, where some of its pixels are lost.
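A minimal sketch of a content term of this kind is given below. It assumes the loss is the mean squared error between the two feature maps, flattened to lists; the published formula is not reproduced in this text, so the normalisation and the function name are assumptions.

```python
def content_loss(feat_x, feat_adv):
    """Mean squared error between two feature maps given as flat lists."""
    if len(feat_x) != len(feat_adv):
        raise ValueError("feature maps must have the same length")
    return sum((a - b) ** 2 for a, b in zip(feat_x, feat_adv)) / len(feat_x)


c_x = [0.2, 0.4, 0.6]     # stands in for C(x), toy values
c_adv = [0.2, 0.5, 0.4]   # stands in for C(x'), toy values
value = content_loss(c_x, c_adv)
```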
[0125]
[0126] Here, $x'_{p,q}$ denotes the pixel at coordinates $(p, q)$ of image $x'$.
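A smoothing term over neighbouring pixels is commonly written as a total-variation style sum. The sketch below assumes that form (squared differences between each pixel and its lower and right neighbours) on a single-channel image; the published formula is not reproduced in this text, so this is an illustration rather than the patented expression.

```python
def smoothing_loss(img):
    """Sum of squared differences between vertically and horizontally adjacent pixels.

    img is a 2D list of grayscale values indexed as img[p][q].
    """
    rows, cols = len(img), len(img[0])
    loss = 0.0
    for p in range(rows):
        for q in range(cols):
            if p + 1 < rows:
                loss += (img[p][q] - img[p + 1][q]) ** 2
            if q + 1 < cols:
                loss += (img[p][q] - img[p][q + 1]) ** 2
    return loss


flat = [[0.5, 0.5], [0.5, 0.5]]      # no high-frequency content
checker = [[0.0, 1.0], [1.0, 0.0]]   # maximal pixel-to-pixel variation
flat_loss = smoothing_loss(flat)
checker_loss = smoothing_loss(checker)
```

A uniform patch scores zero and a checkerboard scores highest, which matches the stated goal of penalising high-frequency perturbations that would be lost when the sign is viewed from far away.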
[0127] This invention provides a method for generating adversarial examples of traffic signs for adversarial training, comprising: acquiring traffic sign images and attack vectors; sequentially performing multi-scale bounding box adjustment, lateral compression, and preset physical adaptation processing on the traffic sign images to obtain traffic sign adapted samples; embedding the traffic sign adapted samples into images to obtain embedded samples; inputting the embedded samples into a first target detection model, and iteratively processing the embedded samples based on the first target detection model and the attack vectors to obtain adversarial attack samples; inputting the adversarial attack samples into a second target detection model, and adversarially training the second target detection model to obtain an adversarial second target detection model; the first target detection model is an attack-side detection model with frozen node parameters; the second target detection model is an adversarial-side detection model with unfrozen node parameters; a preset loss function of the first target detection model is used to calculate the sum of style loss, content loss, adversarial loss, and smoothing loss; the traffic sign adapted samples are obtained based on a vehicle road model and a camera imaging model; the values of style loss and content loss are obtained based on a style extraction model. In this embodiment of the invention, traffic sign adaptation samples are obtained by using a vehicle road model and camera imaging model based on actual roads and shooting equipment, making the final generated adversarial attack samples more in line with the actual scene and improving the quality of the adversarial attack samples. 
Secondly, adversarial attack samples are obtained through the combined effect of style loss, content loss, adversarial loss and smoothing loss, avoiding the problem that existing physical world adversarial attack methods add excessive perturbations to enhance robustness, resulting in poor concealment of adversarial attack samples. Finally, adversarial training of the second target detection model is carried out on the basis of improved adversarial attack sample quality and concealment, thereby improving the adversarial detection effect of the second target detection model.
[0128] Figure 5 is a flowchart of the traffic sign adversarial example generation method for adversarial training provided by an embodiment of the present invention. As shown in Figure 5, traffic sign images are first acquired. On the far left, a model is constructed from the traffic sign image according to a preset attack range, transformation parameters are selected, and the transformed sign is embedded into a background image to obtain embedded samples. These embedded samples are then input into the first target detection model for detection, yielding the current adversarial attack sample, and the adversarial loss is calculated from the detection result. On the right, the smoothing loss is calculated from the pixel values of the target image. In the middle, the style loss and content loss are calculated using the feature extractor. Finally, the adversarial attack sample is updated through backpropagation based on the overall loss.
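The update loop in the flowchart can be illustrated with a toy example. The sketch below is not the patented procedure: it replaces the real detector with a hypothetical frozen logistic scorer, uses only the detection probability as the loss, and swaps in a signed-gradient step for full backpropagation. It does show the two properties the text relies on, namely that only the sample is updated while the model parameters stay fixed, and that iteration stops once the loss falls below a preset threshold.

```python
import math


def detect_prob(x, w, b):
    """Toy frozen 'detector': logistic score of a flattened patch x."""
    logit = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-logit))


def hide_patch(x, w, b, threshold=0.1, step=0.02, max_iters=100):
    """Signed-gradient updates on the patch only; w and b stay frozen."""
    x = list(x)
    for _ in range(max_iters):
        if detect_prob(x, w, b) < threshold:
            break
        # d(prob)/d(x_i) has the sign of w_i, so step against it and clip to [0, 1].
        x = [min(1.0, max(0.0, xi - step * math.copysign(1.0, wi)))
             for xi, wi in zip(x, w)]
    return x, detect_prob(x, w, b)


w = [-1.0 + 2.0 * i / 63 for i in range(64)]   # hypothetical frozen weights
x0 = [0.5] * 64                                # initial patch, pixel values in [0, 1]
patch, prob = hide_patch(x0, w, b=2.0)
```

In the full method the loss would also include the style, content, and smoothing terms, so the patch would be pulled toward a natural appearance as well as toward a low detection probability.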
[0129] To verify the reliability of the adversarial attack samples generated based on the method of the embodiments of the present invention, simulation experiments were also conducted.
[0130] First, experiments verified the attack success rate of adversarial attack samples under different lighting conditions, distances, and camera perspectives for both the hidden attack (HA) and the targeted attack (TA). The results are shown in Figure 6, where the number and color depth in each cell represent the attack success rate; the darker the color, the higher the success rate. Figure 6 shows that, for both attack vectors, the highest attack success rate is achieved under good lighting conditions and with small camera viewpoint shifts. Even under poor lighting conditions or with large viewpoint shifts, the adversarial attack samples retain a certain level of attack effectiveness.
[0131] Secondly, to compare with existing adversarial attack methods, this embodiment of the invention also reproduced several currently best-performing physical environment adversarial attack methods in an experimental environment. During training, the same training parameters (including traffic sign images and style images, loss function weights, range of randomly selected parameters, pre-selected attack distance, etc.) were used, and the attack success rates of different methods were compared under the same illumination, distance, and deflection angle. The experimental results are shown in Table 1:
[0132] Table 1. Attack success rates of different methods under the same experimental conditions.
[0133]
[0134] As shown in Table 1, the method proposed in this invention achieves the best attack success rate within the pre-selected attack range (20m-50m). This indicates that the adversarial attack samples obtained by the method of this invention can achieve better attack effects than existing methods. Furthermore, the higher quality adversarial attack samples directly improve the training effect of the adversarial training of the second target detection model.
[0135] Furthermore, to verify the effectiveness of the compression ratio proposed in this embodiment, adversarial attack samples were trained using the method of this invention both with and without the compression ratio. The variant without the compression ratio randomly selected an angle offset value from the commonly used 0-60° range when fitting the angle. Tests were then conducted under simulated real-world road conditions. For this scenario, a 3-lane model common on Chinese urban roads was selected, with each lane 3.5 m wide, and the adversarial attack sample was placed at the road edge. The camera pointed straight ahead along the road while moving forward along the center of each lane. The test results are shown in Table 2.
[0136] Table 2. Impact of the compression ratio on the attack success rate of adversarial attack samples.
[0137]
[0138] As shown in Table 2, the adversarial attack samples trained with added compression ratios can achieve a higher attack success rate in almost all cases than the adversarial attack samples trained with random angle offset values without compression ratios.
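The geometry of the three-lane test scenario can be worked through numerically. The sketch below uses the lane width and lane count stated in the scenario and a forward distance of 20 m, the near end of the pre-selected attack range. The viewing angle is taken as atan(lateral offset / forward distance), and the foreshortening factor cos(θ) is an assumption made for illustration, not the compression ratio formula of this invention.

```python
import math

LANE_WIDTH_M = 3.5   # lane width from the test scenario
NUM_LANES = 3        # three-lane urban road model


def lateral_offsets(lane_width=LANE_WIDTH_M, lanes=NUM_LANES):
    """Distance from each lane centre to the road edge where the sign stands."""
    return [lane_width * (i + 0.5) for i in range(lanes)]


def view_angle_deg(lateral_m, forward_m):
    """Angle between the driving direction and the line of sight to the sign."""
    return math.degrees(math.atan2(lateral_m, forward_m))


def width_ratio(lateral_m, forward_m):
    """Assumed foreshortening factor cos(theta) for a sign facing along the road."""
    return math.cos(math.atan2(lateral_m, forward_m))


offsets = lateral_offsets()   # 1.75 m, 5.25 m, 8.75 m
angles_at_20m = [view_angle_deg(d, 20.0) for d in offsets]
ratios_at_20m = [width_ratio(d, 20.0) for d in offsets]
```

Under these assumptions, even the farthest lane at 20 m sees the sign at an angle below 25°, and the angle shrinks further as the distance grows toward 50 m. This suggests why sampling offsets uniformly from 0-60° may spend much of the training on viewpoints that rarely occur on the road.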
[0139] In addition to the attack success rate, this embodiment of the invention also evaluates the naturalness of adversarial attack samples using the adversarial attack sample naturalness evaluation model (DPA). The DPA model fits human evaluation scores for adversarial overlay camouflage, providing an objective evaluation that is closest to the human perspective. Figure 7 shows adversarial attack samples generated by the various methods. Figure 7(a) is the original image of the traffic sign. Figure 7(b) shows an adversarial attack sample generated by the method of the present invention. Figure 7(c) shows an adversarial attack sample generated by the AdvCam method. Figure 7(d) shows an adversarial attack sample generated by the FTE method. Figure 7(e) shows an adversarial attack sample generated by the FTE (Reproduced) method. Figure 7(f) shows an adversarial attack sample generated by the system-level attack method. Table 3 gives the evaluation results produced by the naturalness assessment model DPA for the samples in Figure 7.
[0140] Table 3. Evaluation results of the naturalness assessment model (DPA).
[0141]
Sample source | DPA score
Original image | 2.9548
This invention | 2.4752
AdvCam | 2.4304
FTE | 2.3381
FTE (Reproduced) | 2.3892
System-level attack | 2.3264
[0142] Table 3 shows the DPA scores for the naturalness of the adversarial attack samples (on a scale of 1-5, with 5 being the highest). The results show that the method of this invention can create visually concealed adversarial attack samples and achieves the highest DPA score among the adversarial attack methods compared (2.4752, against 2.9548 for the unmodified original image), verifying the sample quality of the adversarial attack samples in this embodiment. Furthermore, the second target detection model trained on these adversarial attack samples also exhibits superior adversarial detection performance.
[0143] This invention utilizes a real road model for modeling. During the training process of creating adversarial attack samples, this real road model guides the selection of transformation parameters when embedding the adversarial attack samples into the background image. This allows the adversarial attack samples to adapt to actual physical transformations during training, ensuring that they maintain strong adversarial capabilities even after undergoing various physical transformations in a real physical environment. Furthermore, addressing the issue that samples with added physical adaptations often exhibit poor concealment and low naturalness, this invention employs an image style transfer method to hide conspicuous adversarial perturbations within a natural style mode. This ensures that the adversarial samples possess sufficient concealment and are not detected by the human eye before the attack takes effect. Through research on adversarial attack samples of traffic signs, this invention's method can guide the target detector (second target detection model) of intelligent vehicles in defensive training against adversarial attack samples, laying the foundation for safe driving in intelligent vehicles.
[0144] The method provided in this embodiment of the invention can be applied to electronic devices. Specifically, the electronic device can be a desktop computer, a portable computer, a smart mobile terminal, a server, etc.; the embodiments of the invention do not limit the type of electronic device.
[0145] Based on the same inventive concept, embodiments of the present invention also provide a traffic sign adversarial example generation device for adversarial training. Figure 8 is a schematic diagram of the traffic sign adversarial example generation device for adversarial training provided in an embodiment of the present invention. The device includes: a processor 710, a storage medium 720, and a bus 730. The storage medium 720 stores machine-readable instructions executable by the processor 710. When the traffic sign adversarial example generation device for adversarial training is running, the processor 710 communicates with the storage medium 720 via the bus 730, and the processor 710 executes the machine-readable instructions to perform the steps of the above-described method embodiment. The specific implementation and technical effects are similar to those of the method embodiment and are not described again here.
[0146] The storage medium may include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk storage device. Optionally, the storage medium may also be at least one storage device located remotely from the aforementioned processor.
[0147] The processors mentioned above can be general-purpose processors, including central processing units (CPUs), network processors (NPs), etc.; they can also be digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
[0148] It should be noted that the terms "first," "second," etc., are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the invention.
[0149] In the description of this specification, the references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., refer to specific features or characteristics described in connection with that embodiment or example, which are included in at least one embodiment or example of the present invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Furthermore, those skilled in the art can combine and integrate the different embodiments or examples described in this specification.
[0150] Although the invention has been described herein in conjunction with various embodiments, those skilled in the art will understand and implement other variations of the disclosed embodiments by reviewing the accompanying drawings and the disclosure in carrying out the claimed invention. In the description of the invention, the word "comprising" does not exclude other components or steps, "a" or "an" does not exclude a plurality, and "a plurality" means two or more, unless otherwise explicitly specified. Furthermore, while different embodiments may describe certain measures, this does not mean that these measures cannot be combined to produce good results.
[0151] The above description, in conjunction with specific preferred embodiments, provides a further detailed explanation of the present invention. It should not be construed that the specific implementation of the present invention is limited to these descriptions. For those skilled in the art, various simple deductions or substitutions can be made without departing from the concept of the present invention, and all such modifications and substitutions should be considered within the scope of protection of the present invention.
Claims
1. A method for generating adversarial examples of traffic signs for adversarial training, characterized in that it comprises: obtaining traffic sign images and attack vectors; sequentially subjecting the traffic sign image to multi-scale bounding box adjustment, lateral compression, and preset physical adaptation processing to obtain traffic sign adaptation samples; performing image embedding on the traffic sign adaptation samples to obtain embedded samples; inputting the embedded samples into a first target detection model, and iteratively processing the embedded samples based on the first target detection model and the attack vector to obtain adversarial attack samples; inputting the adversarial attack samples into a second target detection model, and adversarially training the second target detection model to obtain an adversarial second target detection model; wherein the first target detection model is an attack-side detection model with its node parameters frozen; the second target detection model is an adversarial-side detection model with its node parameters not frozen; the preset loss function of the first target detection model is used to calculate the sum of style loss, content loss, adversarial loss, and smoothing loss; the traffic sign adaptation samples are obtained based on a vehicle road model and a camera imaging model; the values of the style loss and the content loss are obtained based on a style extraction model; and the step of inputting the embedded samples into the first target detection model and iteratively processing the embedded samples based on the first target detection model and the attack vector to obtain adversarial attack samples includes: inputting the embedded samples into the first target detection model; iteratively processing the embedded samples based on the first target detection model and the preset loss function corresponding to the attack vector; and taking, as the adversarial attack sample, the embedded sample for which the loss value of the preset loss function is less than a preset threshold.
2. The method for generating adversarial examples of traffic signs for adversarial training according to claim 1, characterized in that sequentially subjecting the traffic sign image to multi-scale bounding box adjustment, lateral compression, and preset physical adaptation processing to obtain traffic sign adaptation samples includes: adjusting the traffic sign image at multiple scales based on the vehicle road model and preset attack parameters to obtain a first target image; laterally compressing the first target image based on the camera imaging model to obtain a second target image; and subjecting the second target image to the preset physical adaptation processing to obtain the traffic sign adaptation sample.
3. The method for generating adversarial examples of traffic signs for adversarial training according to claim 2, characterized in that the preset attack parameters include a preset attack range and a preset attack target; and the preset attack range includes the farthest detection distance and the shortest processing distance.
4. A method for generating adversarial examples of traffic signs for adversarial training according to claim 1 or 2, characterized in that, The preset physical adaptation processing includes: brightness transformation, contrast transformation, saturation transformation, resolution transformation, Gaussian noise transformation, and Gaussian blur transformation.
5. The method for generating adversarial examples of traffic signs for adversarial training according to claim 1, characterized in that the preset loss function includes a first preset loss or a second preset loss; the attack vector includes a hidden attack or a targeted attack; and the iterative processing of the embedded samples based on the first target detection model and the preset loss function corresponding to the attack vector includes: when the attack vector is a hidden attack, iteratively processing the embedded samples based on the first target detection model and the first preset loss; and when the attack vector is a targeted attack, iteratively processing the embedded samples based on the first target detection model and the second preset loss; wherein the hidden attack is one in which the probability of the target object output when the adversarial attack sample is detected by the first target detection model is less than a threshold, and the targeted attack is one in which the adversarial attack sample is detected as a non-target object by the first target detection model and the probability of the non-target object is greater than a threshold.
6. The method for generating adversarial examples of traffic signs for adversarial training according to claim 1, characterized in that the calculation of the values of the style loss and the content loss includes: acquiring traffic sign images and target style images; inputting the traffic sign image and the target style image into the style extraction model to perform style overlay processing on the traffic sign image, obtaining a styled traffic sign image; calculating the value of the style loss based on the style difference between the styled traffic sign image and the target style image; and calculating the value of the content loss based on the content difference between the styled traffic sign image and the traffic sign image.
7. The method for generating adversarial examples of traffic signs for adversarial training according to claim 2, characterized in that the bounding box size used in the multi-scale bounding box adjustment is expressed as: ; where the quantities in the expression denote, respectively, the bounding box size at a given scale, the index of that scale, the total number of scales, the camera focal length, the size of the traffic sign image, the camera pixel density, the farthest detection distance, and the shortest processing distance; the compression angle of the lateral compression processing is expressed as: ; the compression ratio of the lateral compression processing is expressed as: ; where the quantities denote, respectively, the compression ratio of the lateral compression processing, the compression angle of the lateral compression processing, and the perpendicular distance between the traffic sign image and the straight line along the vehicle's direction of travel.
8. The method for generating adversarial examples of traffic signs for adversarial training according to claim 5, characterized in that the first preset loss is expressed as: ; where $L_H$ denotes the first preset loss, whose adversarial term is the first adversarial loss corresponding to the first preset loss; $L_{style}$ denotes the style loss; $L_{content}$ denotes the content loss; $L_{smooth}$ denotes the smoothing loss; $m$ denotes the total number of background images used for image embedding; $n$ denotes the number of traffic sign adaptation samples embedded in each background image; $k$ denotes the number of bounding boxes retained after bounding box filtering during detection; $P_{ijl}$ denotes the probability that the $j$-th traffic sign adaptation sample in the $i$-th background image exists in the $l$-th bounding box retained after bounding box filtering; $w_a$ denotes the adversarial loss coefficient; $w_s$ denotes the style loss coefficient; $w_c$ denotes the content loss coefficient; and $w_m$ denotes the smoothing loss coefficient; the second preset loss is expressed as: ; where $L_T$ denotes the second preset loss, whose adversarial term is the second adversarial loss corresponding to the second preset loss; $V_{ijl}^{y'}$ denotes the classification confidence of the attack target category $y'$; $V_{ijl}^{z}$ denotes the classification confidence of a category $z$ other than $y'$; and $N$ denotes the total number of categories of the adversarial attack samples.
9. A traffic sign adversarial example generation device for adversarial training, characterized in that it comprises a processor, a storage medium, and a bus; the storage medium stores machine-readable instructions executable by the processor; when the traffic sign adversarial example generation device for adversarial training is running, the processor communicates with the storage medium via the bus, and the processor executes the machine-readable instructions to perform the steps of the traffic sign adversarial example generation method for adversarial training according to any one of claims 1-8.