Adversarial example detection methods, devices, equipment, and storage media

By extracting and compressing image sample features and calculating the difference value to determine whether it is an adversarial example, the problem of unstable detection effect in the existing technology is solved, and high accuracy and high efficiency of adversarial example detection are achieved.

CN117745626B · Active · Publication Date: 2026-06-30 · QI-ANXIN LEGENDSEC INFORMATION TECH (BEIJING) INC +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: QI-ANXIN LEGENDSEC INFORMATION TECH (BEIJING) INC
Filing Date: 2022-09-15
Publication Date: 2026-06-30

Abstract

This invention provides a method, apparatus, device, and storage medium for adversarial example detection. The method includes: extracting sample features from a reference image sample and an image sample to be detected; compressing the image sample to be detected and extracting the compressed features of the compressed image sample; determining the difference value between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference value between the sample features of the reference image sample and the compressed features; and determining whether the image sample to be detected is an adversarial example based on these two difference values. The above scheme achieves high accuracy in adversarial example detection.

Description

Technical Field

[0001] This invention relates to the field of image security detection technology, and in particular to an adversarial sample detection method, apparatus, device, and storage medium.

Background Technology

[0002] Adversarial attacks refer to the addition of subtle, imperceptible perturbations to an input image. These perturbations abnormally alter the deep features extracted by the comparison model, "deceiving" the model and causing incorrect recognition results. Samples that carry adversarial perturbations and can "deceive" the model into outputting incorrect judgments are called adversarial samples.

[0003] Adversarial example detection analyzes an input image to determine whether it is an adversarial example. If it is, the image is rejected before it reaches the recognition model; otherwise, it is passed on normally. Conventional adversarial example detection methods train a detection model on deep features from both normal and adversarial examples and use it to evaluate image samples. The generation method and quality of the adversarial examples directly affect the training of the detection model, and thus the detection results. A detection model trained on adversarial examples generated by methods with poor attack effectiveness has poor robustness, and its detection performance is unstable when it faces samples produced by new adversarial example generation methods.

Summary of the Invention

[0004] This invention provides an adversarial sample detection method, apparatus, device, and storage medium to solve the technical problem of unstable detection results.

[0005] Specifically, the embodiments of the present invention provide the following technical solutions:

[0006] In a first aspect, embodiments of the present invention provide an adversarial example detection method, comprising:

[0007] extracting sample features from a reference image sample and an image sample to be detected;

[0008] compressing the image sample to be detected, and extracting the compressed features of the compressed image sample;

[0009] determining, respectively, the difference value between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference value between the sample features of the reference image sample and the compressed features; and

[0010] determining whether the image sample to be detected is an adversarial sample based on the difference value between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference value between the sample features of the reference image sample and the compressed features.

[0011] In a second aspect, embodiments of the present invention provide an adversarial sample detection device, comprising:

[0012] The feature extraction module is used to extract sample features from the reference image sample and the image sample to be detected;

[0013] The feature extraction module is also used to compress the image sample to be detected and extract the compression features of the compressed image sample.

[0014] The processing module is used to determine the difference value between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference value between the sample features of the reference image sample and the compressed features, respectively.

[0015] The processing module is further configured to determine whether the image sample to be detected is an adversarial sample based on the difference between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference between the sample features of the reference image sample and the compressed features.

[0016] In a third aspect, embodiments of the present invention also provide an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of the adversarial sample detection method as described in the first aspect.

[0017] In a fourth aspect, embodiments of the present invention also provide a non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the adversarial sample detection method as described in the first aspect.

[0018] In a fifth aspect, embodiments of the present invention also provide a computer program product, including a computer program that, when executed by a processor, implements the steps of the adversarial sample detection method as described in the first aspect.

[0019] The adversarial sample detection method, apparatus, device, and storage medium provided in this invention extract sample features from a reference image sample and an image sample to be detected; compress the image sample to be detected to remove redundant features, and extract the compressed features of the compressed image sample; determine the difference value between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference value between the sample features of the reference image sample and the compressed features; and determine whether the image sample to be detected is an adversarial sample based on these two difference values. Because the determination uses the difference values between the reference image sample and both the compressed and uncompressed versions of the image sample to be detected, the detection accuracy is high.

Brief Description of the Drawings

[0020] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0021] Figure 1 is a first schematic flowchart of the adversarial sample detection method provided in an embodiment of the present invention;

[0022] Figure 2 is a first schematic diagram of image compression for the adversarial sample detection method provided in an embodiment of the present invention;

[0023] Figure 3 is a second schematic diagram of image compression for the adversarial sample detection method provided in an embodiment of the present invention;

[0024] Figure 4 is a third schematic diagram of image compression for the adversarial sample detection method provided in an embodiment of the present invention;

[0025] Figure 5 is a second schematic flowchart of the adversarial sample detection method provided in an embodiment of the present invention;

[0026] Figure 6 is an example diagram illustrating the implementation principle of the adversarial sample detection method provided in an embodiment of the present invention;

[0027] Figure 7 is a schematic diagram of the structure of the adversarial sample detection device provided by the present invention;

[0028] Figure 8 is a schematic diagram of the structure of the electronic device provided by the present invention.

Detailed Description

[0029] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of this invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this invention. All other embodiments obtained by those skilled in the art based on the embodiments of this invention without creative effort are within the scope of protection of this invention.

[0030] First, the relevant concepts involved in the embodiments of this invention will be explained:

[0031] Facial recognition technology mainly consists of two parts: face detection and face comparison. Face comparison is the primary part for obtaining facial identity information. First, deep feature extraction is performed on the input facial image. Then, the similarity between the input facial features and features in a face database is calculated and compared to obtain the corresponding identity information from the database. During this process, errors in the model's inference results will directly affect the final result of facial recognition.

[0032] Adversarial attacks on face comparison mainly fall into two categories: evasion attacks and spoofing attacks. Evasion attacks involve adding adversarial perturbations to the input samples, causing two face images of the same person to be identified as different individuals. Spoofing attacks, on the other hand, involve adding adversarial perturbations to the input samples, causing two face images of different individuals to be identified as the same person. Due to the existence of adversarial attacks, adversarial sample detection for face comparison is introduced. The advantage of performing adversarial sample detection analysis before inputting the face image samples to the face comparison model is that it eliminates the need to retrain the face comparison model, provides good defense for large-scale face comparison models, and has strong transferability.

[0033] The method of this invention can be applied to image samples, such as face image samples, and can also be applied to other image samples. This invention does not limit this application.

[0034] The technical solutions of the embodiments of the present invention are described in detail below with reference to Figures 1-6 and specific examples. The following specific examples can be combined with each other, and the same or similar concepts or processes may not be described again in some examples.

[0035] Figure 1 is a first schematic flowchart of the adversarial example detection method provided in this embodiment of the invention. As shown in Figure 1, the method provided in this embodiment includes:

[0036] Step 101: Extract sample features from the reference image sample and the image sample to be detected;

[0037] Specifically, determining whether an image sample to be detected is an adversarial sample can be based on the image features of the image sample. To make the results more accurate, a reference image sample can be added. First, the image features of the reference image sample and the image sample to be detected are extracted to obtain the sample features.

[0038] Step 102: Compress the image sample to be detected and extract the compression features of the compressed image sample;

[0039] Specifically, in order to make the results of adversarial sample detection more accurate and improve detection efficiency, the image samples to be detected can be compressed to reduce redundant image features, and the compressed features of the compressed image samples can be extracted.

[0040] Step 103: Determine the difference between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference between the sample features of the reference image sample and the compressed features.

[0041] For example, the sample features extracted based on the reference image sample are denoted as K1, the sample features extracted based on the image sample to be detected are denoted as K2, and the compressed features extracted after the image sample to be detected is first compressed are denoted as K3.

[0042] Calculate the differences between K1 and K2, and between K1 and K3 respectively.

[0043] Step 104: Determine whether the image sample to be detected is an adversarial sample based on the difference between the sample features of the reference image sample and the sample features of the image sample to be detected, as well as the difference between the sample features of the reference image sample and the compressed features.

[0044] Specifically, the method determines whether an image sample to be detected is an adversarial sample by analyzing the differences between features. For example, the maximum, minimum, range, or weighted average of the differences are compared with a preset threshold to determine whether it is an adversarial sample.

[0045] The method in this embodiment compresses the image sample to be detected to reduce redundant features, and uses the difference values between the reference image sample and both the compressed and uncompressed versions of the image sample to be detected to determine whether the image sample to be detected is an adversarial sample, which yields high detection accuracy.

[0046] Optionally, the reference image sample and the image sample to be detected can be input into the feature extraction model respectively to obtain the sample features.

[0047] Optionally, the feature extraction model is a deep feature extraction model.

[0048] Optionally, the feature extraction model is obtained by training a machine learning model, such as a neural network, on image samples. The training can also use labels of the image samples, such as previously acquired image features.

[0049] Optionally, the image samples to be detected are input into M compressors for compression, and the compression features of the M compressed image samples are extracted; M is an integer greater than 1; each of the M compressors uses a different compression method.

[0050] Optionally, the M compressors can be at least two of the following: bit compressors, median smoothing compressors, or nonlocal mean smoothing compressors.

[0051] Each compressor's parameters include at least one of the following: the number of bits in the bit compressor, the first window size of the median smoothing compressor, the second window size of the nonlocal mean smoothing compressor, and the number of similar windows.

[0052] Optionally, when the compressor is a bit compressor, each pixel value of the image sample to be detected, which is represented with a first bit number, is bit-compressed to obtain a compressed pixel value represented with a second bit number, where the second bit number is less than the first bit number;

[0053] When the compressor is a median smoothing compressor, for any pixel value of the image sample to be detected, the median of the pixel values in the first window containing that pixel value is obtained, and the median is used as the compressed pixel value.

[0054] When the compressor is a nonlocal mean smoothing compressor, for any pixel value of the image sample to be detected, a second window containing the pixel value is obtained. P windows similar to the second window are obtained in the image sample to be detected. The weighted average of the pixel values of the P windows similar to the second window is used as the compressed pixel value, where P is a positive integer.

[0055] Specifically, a bit compressor compresses the pixel values of an image. A typical color image has pixel values ranging from 0 to 255, represented by 8 binary bits. Bit compression uses fewer bits to approximate the pixel values of an image. Optionally, pixel bit compression is calculated as follows:

[0056] $$\text{output} = \operatorname{round}\left(\frac{\text{input}}{255.0} \times \left(2^{i} - 1\right)\right)$$

[0057] Where input is the input pixel value, output is the output pixel value, i is the number of bits after compression (an integer in the range [1, 7]), and round is the rounding function.
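The bit compression formula in [0056] can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, images are assumed to be nested lists of 8-bit integers, and Python's built-in round (which rounds exact ties to the nearest even integer) stands in for the unspecified rounding rule.

```python
def bit_compress(pixel: int, bits: int) -> int:
    """Apply output = round(input / 255.0 * (2**bits - 1)) to one 8-bit pixel."""
    if not 0 <= pixel <= 255:
        raise ValueError("pixel must be in [0, 255]")
    if not 1 <= bits <= 7:
        raise ValueError("bits must be an integer in [1, 7]")
    levels = (1 << bits) - 1  # largest value representable with `bits` bits
    # Assumption: Python's round() is used; the source does not fix a tie rule.
    return round(pixel / 255.0 * levels)


def bit_compress_image(image, bits):
    """Compress every pixel of a 2-D image given as a list of rows."""
    return [[bit_compress(p, bits) for p in row] for row in image]
```

The result lies in the range 0 to 2^i - 1. Whether it is rescaled to 0-255 before feature extraction is not stated in the source, so the sketch leaves it unscaled.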

[0058] The median smoothing compressor replaces each pixel in an image with the median value of all pixels within a first window centered on that pixel. It is a non-linear smoothing filtering algorithm. Median smoothing of an image can be expressed as:

[0059] $$g(x, y) = \operatorname{median}\{\, f(i, j) \mid (i, j) \in S \,\}$$

[0060] Where S is the first window centered on pixel (x, y), (i, j) ranges over the pixel positions in S, the size of the first window is usually odd, f is the pixel value of the input image, and g(x, y) is the pixel value of pixel (x, y) after median smoothing.
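As an illustration of median smoothing as described in [0058]-[0060], the following minimal Python sketch replaces each pixel with the median of the window centered on it. The function name is hypothetical, the image is assumed to be a nested list of single-channel pixel values, and clamping coordinates at the image border is an assumption, since the source does not describe border handling.

```python
from statistics import median


def median_smooth(image, window=3):
    """Replace each pixel with the median of the window x window area around it."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window size must be a positive odd integer")
    rows, cols = len(image), len(image[0])
    radius = window // 2
    output = [[0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            # Assumption: positions outside the image are clamped to the border.
            values = [
                image[min(max(i, 0), rows - 1)][min(max(j, 0), cols - 1)]
                for i in range(x - radius, x + radius + 1)
                for j in range(y - radius, y + radius + 1)
            ]
            output[x][y] = median(values)
    return output
```

Because the window holds an odd number of values, the median is always one of the original pixel values, so isolated outliers are removed without introducing new intensities.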

[0061] The nonlocal mean smoothing compressor takes a pixel at any location in a given image, takes a second window of fixed size around that pixel, searches the image for windows similar to the second window, averages the pixel values of the similar windows, and replaces the pixel with the result.
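The nonlocal mean smoothing described above can be sketched as follows for a small single-channel image. Several details are assumptions not given in the source: similarity is measured as the mean squared difference between windows, the weights are exp(-distance / h^2) with a hypothetical smoothing parameter h, the center pixel of each similar window is the value being averaged, and border positions are clamped. The exhaustive search is quadratic in the number of pixels, so this is only suitable for illustration.

```python
import math


def _window(image, x, y, radius):
    """Flatten the window centered on (x, y); border positions are clamped."""
    rows, cols = len(image), len(image[0])
    return [
        image[min(max(i, 0), rows - 1)][min(max(j, 0), cols - 1)]
        for i in range(x - radius, x + radius + 1)
        for j in range(y - radius, y + radius + 1)
    ]


def nonlocal_mean_smooth(image, window=3, num_similar=5, h=10.0):
    """Replace each pixel with a weighted mean over its most similar windows."""
    rows, cols = len(image), len(image[0])
    radius = window // 2
    windows = {(x, y): _window(image, x, y, radius)
               for x in range(rows) for y in range(cols)}
    output = [[0.0] * cols for _ in range(rows)]
    for (x, y), reference in windows.items():
        # Score every window in the image by its distance to the reference window.
        scored = sorted(
            (sum((a - b) ** 2 for a, b in zip(reference, candidate)) / len(reference),
             image[cx][cy])
            for (cx, cy), candidate in windows.items()
        )
        nearest = scored[:num_similar]  # the P most similar windows
        weights = [math.exp(-distance / (h * h)) for distance, _ in nearest]
        output[x][y] = sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
    return output
```

The reference window always matches itself with distance 0, so the weight sum is never zero.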

[0062] Figures 2-4 show the compression effects of the different compressors. Figure 2 shows the compression effect of the bit compressor at different bit depths: when the bit depth is less than 4, there is obvious image distortion, whereas a bit depth of 4 or more has no significant impact on how the image is displayed. Figure 3 shows the compression effect of the median smoothing compressor with different first-window sizes. Figure 4 shows the compression effect of the nonlocal mean smoothing compressor, whose parameters are the size of the second window (e.g., (3,3) and (7,7)) and the number of similar windows (21).

[0063] In the above embodiments, the image samples to be detected are compressed using various compression methods, and then the compression features of the compressed image samples are extracted. The compression reduces redundant features, making the final result more accurate and efficient.

[0064] Optionally, step 103 can be implemented in the following way:

[0065] The Euclidean distance between the sample features of the reference image sample and the sample features of the image sample to be detected, and the Euclidean distance between the sample features of the reference image sample and the compressed features are calculated respectively, and the Euclidean distance is determined as the difference value.

[0066] Specifically, the difference can be calculated by measuring the Euclidean distance between features, using the following formula:

[0067] $$d(x, y) = \sqrt{\sum_{i=1}^{n} \left(x_i - y_i\right)^2}$$

[0068] Where x and y are the feature vectors of the sample features or compressed features, n is the vector dimension (for example, 512), and x_i and y_i are the values of the i-th components of the respective feature vectors.
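A minimal Python sketch of the Euclidean distance between two feature vectors is shown below. The function name is hypothetical, and feature vectors are assumed to be plain sequences of numbers of equal length.

```python
import math


def euclidean_distance(x, y):
    """Return the Euclidean distance between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
```

For example, `euclidean_distance([0.0, 0.0], [3.0, 4.0])` evaluates to 5.0.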

[0069] In the above implementation, using the Euclidean distance between features as the difference value keeps the computational complexity low.

[0070] Optionally, the method further includes:

[0071] The adversarial sample determination threshold is determined based on the difference between the sample features of the reference image sample and the sample features of the image sample to be detected.

[0072] Specifically, adversarial attack types can be divided into two types: evasion attack type and impersonation attack type. When determining whether different types of image samples are adversarial samples, different adversarial sample determination thresholds can be used.

[0073] The threshold for adversarial example determination can be determined based on the differences between sample features.

[0074] Optionally, if the difference value is greater than the target determination threshold, the adversarial example determination threshold is determined to be the adversarial example determination threshold corresponding to the evasion attack type; the target determination threshold is used to determine whether the image samples being compared are image samples of the same object.

[0075] If the difference value is not greater than the target determination threshold, the adversarial sample determination threshold is determined to be the adversarial sample determination threshold corresponding to the spoofing attack type.

[0076] Specifically, the same object can be the same person, the same item, or items with the same appearance. Take the same person as an example. If the difference value is greater than the target determination threshold, which is used to determine whether the image sample to be detected and the reference image sample belong to the same person, the two are considered not to be the same person. If the image sample to be detected is an adversarial example in this case, the attack type can only be an evasion attack, so the adversarial example determination threshold corresponding to the evasion attack type is used. If the difference value is not greater than the target determination threshold, the two are considered to be the same person. If the image sample to be detected is an adversarial example in this case, the attack type can only be a spoofing (impersonation) attack, so the adversarial example determination threshold corresponding to the spoofing attack type is used.
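The threshold selection described in [0074]-[0076] amounts to a single comparison. The sketch below is illustrative only: the function name, parameter names, and the string labels "evasion" and "spoofing" are hypothetical.

```python
def select_adversarial_threshold(difference, same_object_threshold,
                                 evasion_threshold, spoofing_threshold):
    """Pick the adversarial determination threshold from the reference-vs-test difference.

    A difference above the same-object threshold means the two samples look like
    different objects, so only an evasion attack is possible; otherwise only a
    spoofing attack is possible.
    """
    if difference > same_object_threshold:
        return "evasion", evasion_threshold
    return "spoofing", spoofing_threshold
```

Returning the attack type together with the threshold lets the caller report the type if the sample is later judged adversarial.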

[0077] In the above implementation, before determining whether a sample is an adversarial sample, it is first determined which attack type the determination threshold corresponds to. Consequently, when the image sample to be detected is identified as an adversarial sample, its attack type is determined at the same time, which helps reveal the attacker's intent.

[0078] Optionally, step 104 can be implemented in the following way:

[0079] The difference value between the sample features of the reference image sample and the sample features of the image sample to be detected is determined, as well as the difference values between the sample features of the reference image sample and each of the M compressed features, to obtain M+1 difference values.

[0080] Based on the range of the M+1 difference values and the adversarial example determination threshold, it is determined whether the image sample to be detected is an adversarial example.

[0081] Specifically, the range is calculated for the obtained M+1 difference values. The range is calculated as follows:

[0082] R = max(x) - min(x)

[0083] Where x is the set of difference values; that is, the range is the difference between the maximum and minimum values in the set.

[0084] Furthermore, the adversarial sample determination thresholds corresponding to the different attack types are obtained in advance, for example through training, with a separate threshold for each attack type.

[0085] If the range value is greater than the adversarial sample determination threshold, it is determined to be an adversarial sample; otherwise, it is a normal sample.
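The range-based decision in [0079]-[0085] can be sketched as follows. The function names are hypothetical, and the difference values are assumed to be a non-empty sequence holding the M+1 distances.

```python
def difference_range(differences):
    """Return R = max(x) - min(x) over the M+1 difference values."""
    if not differences:
        raise ValueError("at least one difference value is required")
    return max(differences) - min(differences)


def is_adversarial(differences, threshold):
    """A sample is adversarial when the range exceeds the determination threshold."""
    return difference_range(differences) > threshold
```

With differences `[0.9, 1.4, 1.1, 1.6]` the range is about 0.7, so the sample is flagged for a threshold of 0.5 but not for a threshold of 0.8.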

[0086] Optionally, if the adversarial sample determination threshold is the one corresponding to the evasion attack type, and the range of the M+1 difference values is greater than that threshold, the image sample to be detected is determined to be an adversarial sample of the evasion attack type; if the range of the M+1 difference values is not greater than that threshold, the image sample to be detected is determined to be a normal sample.

[0087] If the adversarial sample determination threshold is the one corresponding to the spoofing attack type, and the range of the M+1 difference values is greater than that threshold, the image sample to be detected is determined to be an adversarial sample of the spoofing attack type; if the range of the M+1 difference values is not greater than that threshold, the image sample to be detected is determined to be a normal sample.

[0088] In the above implementation, the range of the M+1 difference values and the adversarial sample determination threshold are used to determine whether the image sample to be detected is an adversarial sample, which makes the determination result more accurate than using the maximum value, minimum value, mean value, etc.

[0089] Optionally, a first model is used to determine whether the image sample to be detected is an adversarial sample. The first model includes M compressors and a feature extraction model. The feature extraction model is used to extract sample features from the reference image sample and the image sample to be detected. Before determining whether the image sample to be detected is an adversarial sample, the method further includes:

[0090] For any normal sample in the normal sample set, the first model is used to obtain the range of the M+1 difference values corresponding to the normal sample.

[0091] The adversarial sample determination threshold is determined based on the range value corresponding to each normal sample in the normal sample set; the adversarial sample determination threshold includes the adversarial sample determination threshold corresponding to evasion attack and the adversarial sample determination threshold corresponding to impersonation attack.

[0092] Using the first model based on the adversarial sample determination threshold, adversarial sample detection is performed on each adversarial sample in the adversarial sample set to obtain the detection results;

[0093] Based on the detection results and the ground-truth labels annotated for each adversarial sample in the adversarial sample set, the parameters of the M compressors are adjusted to optimize the first model until the detection results of the first model meet the preset conditions.

[0094] Specifically, this invention utilizes a normal sample set and an adversarial sample set built on the LFW (Labeled Faces in the Wild) dataset to construct a first model. The first model includes M compressors and a feature extraction model. Furthermore, the method of this invention is applicable to other face comparison datasets.

[0095] Optionally, this embodiment uses face images as an example. For instance, the LFW dataset contains 6000 pairs of face images, of which 3000 pairs belong to the same person, and the remaining 3000 pairs belong to different people. Separating the face image pairs of the same person from those of different people creates two normal sample sets. The dataset of face image pairs of the same person is used to determine the adversarial example determination threshold corresponding to evasion attacks, while the dataset of image pairs of different people is used to determine the adversarial example determination threshold corresponding to spoofing attacks.

[0096] Optionally, the detection performance is tested using adversarial sample sets in order to adjust the parameters of the compressors in the first model. Two adversarial sample sets are constructed, one for evasion attacks and one for spoofing attacks: m images (m is, for example, 300) are randomly selected from the normal sample set of the same person and from the normal sample set of different people, respectively, and attacked. The attack methods include at least two of the following face adversarial sample generation methods: the momentum-based iterative method MIM, the basic iterative method BIM, TIM, CIM, and LGC. Each adversarial sample set can contain 1500 pairs of face image adversarial samples.

[0097] Optionally, when the attack type is evasion attack, the normal sample set is the image sample set of the same object, and the adversarial sample set is the evasion attack sample set;

[0098] When the attack type is a spoofing attack, the normal sample set is the image sample set of different objects, and the adversarial sample set is the spoofing attack sample set.

[0099] Optionally, when the attack type is an evasion attack, the normal sample set is a set of facial image samples of the same person, and the adversarial sample set is an evasion attack sample set.

[0100] When the attack type is a spoofing attack, the normal sample set is a set of face image samples of different people, and the adversarial sample set is a spoofing attack sample set.

[0101] Taking evasion attacks as an example (the same principle and implementation process apply to impersonation attacks), the normal sample set is selected from the same-person sample set, and the adversarial sample set is selected from the evasion attack sample set. Figure 5 is a flowchart of building and optimizing the first model, which mainly includes the following steps:

[0102] (1) Determine the parameters of the compressor in the first model, including the number of bits of the bit compressor, the size of the first window of median smoothing, the size of the second window of non-local mean smoothing, and the number of similar windows; optionally, the initial parameters of the compressor can be preset.

[0103] (2) Input all normal samples in the normal sample set into the first model to obtain multiple range values, and determine the adversarial sample determination threshold based on these range values.

[0104] Optionally, the range values corresponding to the normal samples in the normal sample set are sorted, and the Nth range value is used as the adversarial sample determination threshold, where N is the number of samples in the normal sample set multiplied by a preset ratio.

[0105] The sorting can be ascending or descending, and the preset ratio depends on the order: for example, 95% for ascending order, or 5% for descending order.

[0106] (3) Using the adversarial sample determination threshold obtained in (2), adversarial sample detection is performed on the adversarial sample set, that is, each sample in the adversarial sample set is checked to determine whether it is detected as an adversarial sample. The more of these samples that are detected as adversarial samples, the better the performance of the first model.

[0107] Optionally, the threshold for judging adversarial examples for evasion attacks is different from the threshold for judging adversarial examples for spoofing attacks.

[0108] (4) Determine whether the detection result of (3) has achieved the best detection effect, that is, whether the preset conditions are met. The preset conditions can be characterized by, for example, detection accuracy. If the required detection accuracy is achieved, the preset conditions are considered met and the best detection effect achieved; parameter optimization then stops and the final first model is obtained. Otherwise, the compressor parameters are adjusted and (2) and (3) are repeated.

[0109] The preset condition could be, for example, that the proportion of samples in the adversarial sample set that are detected as adversarial samples exceeds a preset threshold, such as 98%.
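Steps (2) and (3) can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the invention: the function names `calibrate_threshold` and `detection_rate` are hypothetical, the range values are assumed to have been computed already by the first model, and rounding N up to the next integer is an assumption, since the text only states that N is the sample count multiplied by the preset ratio.

```python
import math
from typing import Sequence


def calibrate_threshold(normal_ranges: Sequence[float], ratio: float = 0.95) -> float:
    """Return the N-th range value in ascending order, with N = ceil(count * ratio).

    Rounding up is an assumption; the source only gives N = count * ratio.
    """
    if not normal_ranges:
        raise ValueError("normal_ranges must not be empty")
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    ordered = sorted(normal_ranges)               # ascending order
    n = max(1, math.ceil(len(ordered) * ratio))   # 1-based rank N
    return ordered[n - 1]


def detection_rate(adversarial_ranges: Sequence[float], threshold: float) -> float:
    """Fraction of adversarial samples whose range value exceeds the threshold."""
    if not adversarial_ranges:
        raise ValueError("adversarial_ranges must not be empty")
    detected = sum(1 for value in adversarial_ranges if value > threshold)
    return detected / len(adversarial_ranges)
```

With an ascending sort and a 95% ratio, roughly 5% of normal samples lie above the threshold, which bounds the false alarm rate on the normal sample set. The detection rate on the adversarial sample set can then be compared with a preset condition such as 98%.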

[0110] In the above embodiments, the adversarial sample determination threshold is determined based on the range of the difference values between the features of the image sample to be detected and the reference image sample, thereby achieving the purpose of determining whether it is an adversarial sample and identifying the attack type.

[0111] As shown in Figure 6, threshold 1 is the threshold for determining whether the compared image samples belong to the same person, threshold 2 is the adversarial sample determination threshold for evasion attacks, threshold 3 is the adversarial sample determination threshold for impersonation attacks, and D denotes the Euclidean distance. Figure 6 uses three compressors: a bit compressor, a median smoothing compressor, and a nonlocal mean smoothing compressor. Taking face image samples as an example, the reference face image sample and the face image sample to be detected are input into the feature extraction model, which outputs their sample features. The face image sample to be detected is also compressed by each of the compressors and then input into the feature extraction model, which outputs the compressed features. The Euclidean distance between the sample features of the reference face image sample and those of the face image sample to be detected, as well as the Euclidean distances between the sample features of the reference face image sample and each of the compressed features, are calculated, and the range of these distances is then calculated. Next, it is determined whether the Euclidean distance between the sample features of the reference face image sample and those of the face image sample to be detected is greater than threshold 1. If it is greater than threshold 1, the adversarial sample determination threshold is set to threshold 2; otherwise, it is set to threshold 3. Finally, the range value is compared with the selected threshold (threshold 2 or threshold 3). If the range value is greater than the adversarial sample determination threshold, the sample is an adversarial sample; otherwise, it is a normal sample.
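The decision flow of Figure 6 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the feature extraction model and the compressors are passed in as plain callables, the names `distance_range` and `detect` are hypothetical, and the range is taken in its usual statistical sense (largest distance minus smallest distance).

```python
from typing import Callable, Sequence

import numpy as np

Image = np.ndarray
Extractor = Callable[[Image], np.ndarray]   # stands in for the feature extraction model
Compressor = Callable[[Image], Image]       # stands in for one compressor


def distance_range(extract: Extractor, compressors: Sequence[Compressor],
                   reference: Image, candidate: Image) -> tuple[float, float]:
    """Return (d0, range) where d0 is the uncompressed reference/candidate distance."""
    ref_features = extract(reference)
    d0 = float(np.linalg.norm(ref_features - extract(candidate)))
    distances = [d0]
    for compress in compressors:
        compressed_features = extract(compress(candidate))
        distances.append(float(np.linalg.norm(ref_features - compressed_features)))
    # M compressors give M + 1 distances; the range is max minus min.
    return d0, max(distances) - min(distances)


def detect(extract: Extractor, compressors: Sequence[Compressor],
           reference: Image, candidate: Image,
           threshold_1: float, threshold_2: float, threshold_3: float) -> str:
    """Return 'evasion', 'impersonation', or 'normal' for the candidate image."""
    d0, value_range = distance_range(extract, compressors, reference, candidate)
    if d0 > threshold_1:
        # Compared images look like different people: test for an evasion attack.
        attack, threshold = "evasion", threshold_2
    else:
        # Compared images look like the same person: test for impersonation.
        attack, threshold = "impersonation", threshold_3
    return attack if value_range > threshold else "normal"
```

In practice `extract` would wrap a face recognition network and `compressors` would hold the bit, median smoothing, and nonlocal mean smoothing compressors; here they are left abstract so that the control flow is visible on its own.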

[0112] Table 1 Examples of Detection Results for Adversarial Samples of Evasion Attacks

[0113]

[0114]

[0115] Table 2 Examples of Detection Results for Impersonation Attack Adversarial Samples

[0116]

[0117] In summary: (1) Compared with conventional methods that require adversarial examples for model training, the method of this embodiment uses only normal samples to construct the first model (i.e., the compressor), and adversarial examples are used only to test model performance. As a result, the model itself has a certain degree of defense against adversarial examples generated by unknown attack methods and is more robust. For example, it can detect adversarial examples generated by the aforementioned adversarial example generation methods, and it supports more types of adversarial example generation methods with higher accuracy than methods that add adversarial example training.

[0118] (2) Compared with the method of using a large number of adversarial samples to train the model, the method of this embodiment of the invention only uses a small number of adversarial samples to perform performance testing on the first model. It does not need to rely on a large number of adversarial samples to build the first model, saving the high time cost of generating adversarial samples and greatly improving the model construction efficiency.

[0119] (3) Compared with methods that require changes to the model structure and parameters and model retraining, the present invention does not require changes to the feature extraction model in the first model, but only requires adjustment of the compressor parameters, which is friendly to neural network models with complex structures and large parameter scales.

[0120] The adversarial sample detection device provided by the present invention is described below. The adversarial sample detection device described below can be referred to in correspondence with the adversarial sample detection method described above.

[0121] Figure 7 is a schematic diagram of the adversarial sample detection device provided by the present invention. As shown in Figure 7, the adversarial sample detection device provided in this embodiment includes:

[0122] The feature extraction module 210 is used to extract sample features from the reference image sample and the image sample to be detected;

[0123] The feature extraction module 210 is also used to compress the image sample to be detected and extract the compression features of the compressed image sample.

[0124] The processing module 220 is used to determine the difference value between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference value between the sample features of the reference image sample and the compressed features, respectively.

[0125] The processing module 220 is further configured to determine whether the image sample to be detected is an adversarial sample based on the difference between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference between the sample features of the reference image sample and the compressed features.

[0126] Optionally, the feature extraction module 210 is further configured to:

[0127] The image samples to be detected are input into M compressors for compression, and the compression features of the M compressed image samples are extracted; M is an integer greater than 1; each of the M compressors uses a different compression method.

[0128] Optionally, the processing module 220 is further configured to:

[0129] The adversarial sample determination threshold is determined based on the difference between the sample features of the reference image sample and the sample features of the image sample to be detected.

[0130] Optionally, the processing module 220 is specifically used for:

[0131] If the difference value is greater than the target determination threshold, then the adversarial sample determination threshold is determined to be the adversarial sample determination threshold corresponding to the evasion attack type; the target determination threshold is used to determine whether the image samples being compared are image samples of the same object;

[0132] If the difference value is not greater than the target determination threshold, then the adversarial sample determination threshold is determined to be the adversarial sample determination threshold corresponding to the spoofing attack type.

[0133] Optionally, the processing module 220 is specifically used for:

[0134] The difference values between the sample features of the reference image sample and the sample features of the image sample to be detected are determined, and the difference values between the sample features of the reference image sample and each of the M compressed features are determined, resulting in M+1 difference values.

[0135] Based on the range of the M+1 difference values and the adversarial sample determination threshold, it is determined whether the image sample to be detected is an adversarial sample.

[0136] Optionally, the processing module 220 is specifically used for:

[0137] If the adversarial sample determination threshold is the same as the adversarial sample determination threshold corresponding to the evasion attack type, and the range of the M+1 difference values ​​is greater than the adversarial sample determination threshold corresponding to the evasion attack type, then the image sample to be detected is determined to be an adversarial sample of the evasion attack type; if the range of the M+1 difference values ​​is not greater than the adversarial sample determination threshold corresponding to the evasion attack type, then the image sample to be detected is determined to be a normal sample.

[0138] If the adversarial sample determination threshold is the same as the adversarial sample determination threshold corresponding to the spoofing attack type, and the range of the M+1 difference values ​​is greater than the adversarial sample determination threshold corresponding to the spoofing attack type, then the image sample to be detected is determined to be an adversarial sample of the spoofing attack type; if the range of the M+1 difference values ​​is not greater than the adversarial sample determination threshold corresponding to the spoofing attack type, then the image sample to be detected is determined to be a normal sample.

[0139] Optionally, the processing module 220 is specifically used for:

[0140] The Euclidean distance between the sample features of the reference image sample and the sample features of the image sample to be detected, and the Euclidean distance between the sample features of the reference image sample and the compressed features are calculated respectively, and the Euclidean distance is determined as the difference value.
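The quantities involved can be written compactly as follows. The symbols are introduced here for illustration and do not appear in the original text: f is the feature extraction model, x_r the reference image sample, x the image sample to be detected, c_1, ..., c_M the M compressors, and R the range, taken in its usual statistical sense of maximum minus minimum.

```latex
\[
d_0 = \lVert f(x_r) - f(x) \rVert_2, \qquad
d_i = \lVert f(x_r) - f(c_i(x)) \rVert_2 \quad (i = 1, \dots, M)
\]
\[
R = \max_{0 \le i \le M} d_i \;-\; \min_{0 \le i \le M} d_i
\]
```

The image sample to be detected is judged to be an adversarial sample when R is greater than the adversarial sample determination threshold, where that threshold is chosen by comparing d_0 with the target determination threshold.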

[0141] Optionally, the processing module 220 is specifically used for:

[0142] The first model determines whether the image sample to be detected is an adversarial sample. The first model includes the M compressors and a feature extraction model. The feature extraction model is used to extract sample features of the reference image sample and the image sample to be detected.

[0143] The processing module 220 is further configured to:

[0144] For any normal sample in the normal sample set, the first model is used to obtain the range of M+1 difference values corresponding to the normal sample.

[0145] The adversarial sample determination threshold is determined based on the range value corresponding to each normal sample in the normal sample set; the adversarial sample determination threshold is the adversarial sample determination threshold corresponding to the evasion attack type or the adversarial sample determination threshold corresponding to the spoofing attack type.

[0146] Using the first model based on the adversarial sample determination threshold, adversarial sample detection is performed on each adversarial sample in the adversarial sample set to obtain the detection results;

[0147] Based on the detection results and the true results of each adversarial sample annotation in the adversarial sample set, the parameters of the M compressors are adjusted to optimize the first model until the detection results of the first model meet the preset conditions.
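The optimization loop described above can be sketched in Python as follows. The text does not specify how the compressor parameters are adjusted, so iterating over a fixed list of candidate parameter sets is an assumption made here, as are the names `tune_compressors` and `range_fn` (a hypothetical callable that returns the range of the M+1 difference values for one sample under the given compressor parameters) and the rounding of N up to the next integer.

```python
import math
from typing import Any, Callable, Optional, Sequence


def tune_compressors(candidates: Sequence[Any],
                     range_fn: Callable[[Any, Any], float],
                     normal_samples: Sequence[Any],
                     adversarial_samples: Sequence[Any],
                     ratio: float = 0.95,
                     target_rate: float = 0.98) -> Optional[tuple]:
    """Return (params, threshold, rate) for the best candidate tried.

    Stops early once the detection rate on the adversarial set reaches target_rate.
    """
    if not normal_samples or not adversarial_samples:
        raise ValueError("both sample sets must be non-empty")
    best = None
    for params in candidates:
        # Calibrate the threshold on normal samples only.
        normal_ranges = sorted(range_fn(params, s) for s in normal_samples)
        n = max(1, math.ceil(len(normal_ranges) * ratio))
        threshold = normal_ranges[n - 1]
        # Evaluate on adversarial samples, all of which are labeled adversarial.
        hits = sum(1 for s in adversarial_samples if range_fn(params, s) > threshold)
        rate = hits / len(adversarial_samples)
        if best is None or rate > best[2]:
            best = (params, threshold, rate)
        if rate >= target_rate:
            break  # preset condition met, stop adjusting parameters
    return best
```

Only the compressor parameters change between iterations; the feature extraction model inside `range_fn` is left untouched, which matches the point that no retraining of that model is required.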

[0148] Optionally, the processing module 220 is specifically used for:

[0149] The range values corresponding to each normal sample in the normal sample set are sorted, and the Nth range value is used as the adversarial sample determination threshold, where N is the number of samples in the normal sample set multiplied by a preset ratio.

[0150] Optionally, when the attack type is an evasion attack type, the normal sample set is an image sample set of the same object, and the adversarial sample set is an evasion attack sample set;

[0151] In the case of an attack type of spoofing, the normal sample set is a set of image samples of different objects, and the adversarial sample set is a set of spoofing attack samples.

[0152] Optionally, the M compressors include at least two of the following: a bit compressor, a median smoothing compressor, and a nonlocal mean smoothing compressor, and the parameters of each compressor include at least one of the following: the number of bits of the bit compressor, the size of the first window of the median smoothing compressor, the size of the second window of the nonlocal mean smoothing compressor, and the number of similar windows.

[0153] Optionally, the feature extraction module 210 is specifically used for:

[0154] When the compressor is a bit compressor, for any pixel value of the image sample to be detected, the pixel value is bit compressed to obtain a compressed pixel value; the compressed pixel value is represented based on a second number of bits, where the second number of bits is less than the first number, and any pixel value is represented based on the first number of bits;

[0155] When the compressor is a median smoothing compressor, for any pixel value of the image sample to be detected, the median of the pixel values ​​within a first window containing the pixel value is obtained, and the median is used as the compressed pixel value;

[0156] When the compressor is a nonlocal mean smoothing compressor, for any pixel value of the image sample to be detected, a second window containing the pixel value is obtained, P windows similar to the second window are obtained in the image sample to be detected, and the weighted average of the pixel values ​​of the P windows similar to the second window is used as the compressed pixel value, where P is a positive integer.
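The three compressors can be sketched with NumPy as follows. This is a simplified illustration, not the implementation of the invention: it assumes 8-bit grayscale images, edge padding at the borders, and a brute-force search over the whole image for similar windows (suitable only for small images). The function names, the mapping of bit-compressed values back onto the 8-bit scale, and the smoothing parameter `h` with its exponential weighting are all assumptions introduced here.

```python
import numpy as np


def bit_compress(image: np.ndarray, bits: int) -> np.ndarray:
    """Keep only the top `bits` bits of each 8-bit pixel (low bits set to zero)."""
    if not 1 <= bits < 8:
        raise ValueError("bits must be between 1 and 7")
    shift = 8 - bits
    pixels = np.asarray(image, dtype=np.uint8)
    return (pixels >> shift) << shift


def _windows(image: np.ndarray, window: int) -> np.ndarray:
    """Return an (H, W, window, window) view of edge-padded windows, one per pixel."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    padded = np.pad(image, window // 2, mode="edge")
    return np.lib.stride_tricks.sliding_window_view(padded, (window, window))


def median_smooth(image: np.ndarray, window: int = 3) -> np.ndarray:
    """Replace each pixel with the median of the first window centered on it."""
    return np.median(_windows(image, window), axis=(-2, -1)).astype(image.dtype)


def nonlocal_mean_smooth(image: np.ndarray, window: int = 3,
                         num_similar: int = 4, h: float = 10.0) -> np.ndarray:
    """Replace each pixel with a weighted mean over its P most similar windows."""
    values = image.astype(np.float64)
    flat = _windows(values, window).reshape(-1, window * window)  # one row per pixel
    centers = values.ravel()
    out = np.empty_like(centers)
    for i, patch in enumerate(flat):
        dist = np.mean((flat - patch) ** 2, axis=1)      # similarity to every window
        nearest = np.argsort(dist)[:num_similar]         # the P most similar windows
        weights = np.exp(-dist[nearest] / (h * h))       # closer windows weigh more
        out[i] = np.sum(weights * centers[nearest]) / np.sum(weights)
    return np.rint(out).reshape(image.shape).astype(image.dtype)
```

Here `bits` corresponds to the number of bits of the bit compressor, `window` to the first or second window size, and `num_similar` to the number of similar windows P. A production system would more likely use an optimized library routine for nonlocal mean smoothing.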

[0157] Optionally, both the reference image sample and the image sample to be detected are face image samples.

[0158] The apparatus of this embodiment can be used to execute the method of any of the foregoing method embodiments. Its specific implementation process and technical effects are the same as those in the method embodiments. For details, please refer to the detailed description in the method embodiments, which will not be repeated here.

[0159] Figure 8 illustrates a schematic diagram of the physical structure of an electronic device. As shown in Figure 8, the electronic device may include: a processor 810, a communications interface 820, a memory 830, and a communication bus 840, wherein the processor 810, the communications interface 820, and the memory 830 communicate with each other via the communication bus 840. The processor 810 can call logical instructions in the memory 830 to execute an adversarial example detection method, which includes:

[0160] Extract sample features from reference image samples and image samples to be detected;

[0161] The image samples to be detected are compressed, and the compression features of the compressed image samples are extracted.

[0162] The difference values between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference values between the sample features of the reference image sample and the compressed features are determined respectively.

[0163] Based on the difference between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference between the sample features of the reference image sample and the compressed features, it is determined whether the image sample to be detected is an adversarial sample.

[0164] Furthermore, the logical instructions in the aforementioned memory 830 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, essentially, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0165] In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions, wherein when the program instructions are executed by a computer, the computer is able to execute the adversarial example detection method provided by the above method embodiments, the method comprising:

[0166] Extract sample features from reference image samples and image samples to be detected;

[0167] The image samples to be detected are compressed, and the compression features of the compressed image samples are extracted.

[0168] The difference values between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference values between the sample features of the reference image sample and the compressed features are determined respectively.

[0169] Based on the difference between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference between the sample features of the reference image sample and the compressed features, it is determined whether the image sample to be detected is an adversarial sample.

[0170] In another aspect, the present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the adversarial example detection method described above, the method comprising:

[0171] Extract sample features from reference image samples and image samples to be detected;

[0172] The image samples to be detected are compressed, and the compression features of the compressed image samples are extracted.

[0173] The difference values between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference values between the sample features of the reference image sample and the compressed features are determined respectively.

[0174] Based on the difference between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference between the sample features of the reference image sample and the compressed features, it is determined whether the image sample to be detected is an adversarial sample.

[0175] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Those skilled in the art can understand and implement this without any creative effort.

[0176] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus necessary general-purpose hardware platforms, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments.

[0177] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for detecting an adversarial sample, characterized in that it comprises:
Extracting sample features from a reference image sample and an image sample to be detected;
Compressing the image sample to be detected, and extracting the compression features of the compressed image sample;
Determining, respectively, the difference values between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference values between the sample features of the reference image sample and the compressed features;
Determining whether the image sample to be detected is an adversarial sample based on the difference between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference between the sample features of the reference image sample and the compressed features;
The step of compressing the image sample to be detected and extracting the compression features of the compressed image sample includes:
The image sample to be detected is input into M compressors for compression, and the compression features of the M compressed image samples are extracted; M is an integer greater than 1; each of the M compressors uses a different compression method;
The method further includes:
If the difference between the sample features of the reference image sample and the sample features of the image sample to be detected is greater than the target determination threshold, then the adversarial sample determination threshold is determined to be the adversarial sample determination threshold corresponding to the evasion attack type; the target determination threshold is used to determine whether the image samples being compared are image samples of the same object;
If the difference between the sample features of the reference image sample and the sample features of the image sample to be detected is not greater than the target determination threshold, then the adversarial sample determination threshold is determined to be the adversarial sample determination threshold corresponding to the spoofing attack type;
The step of determining whether the image sample to be detected is an adversarial sample based on the difference between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference between the sample features of the reference image sample and the compressed features, includes:
The difference values between the sample features of the reference image sample and the sample features of the image sample to be detected are determined, and the difference values between the sample features of the reference image sample and each of the M compressed features are determined, resulting in M+1 difference values;
Based on the range of the M+1 difference values and the adversarial sample determination threshold, it is determined whether the image sample to be detected is an adversarial sample.

2. The method of claim 1, wherein, The step of determining whether the image sample to be detected is an adversarial sample based on the range of the M+1 difference values ​​and the adversarial sample determination threshold includes: If the adversarial sample determination threshold is the same as the adversarial sample determination threshold corresponding to the evasion attack type, and the range of the M+1 difference values ​​is greater than the adversarial sample determination threshold corresponding to the evasion attack type, then the image sample to be detected is determined to be an adversarial sample of the evasion attack type; if the range of the M+1 difference values ​​is not greater than the adversarial sample determination threshold corresponding to the evasion attack type, then the image sample to be detected is determined to be a normal sample. If the adversarial sample determination threshold is the same as the adversarial sample determination threshold corresponding to the spoofing attack type, and the range of the M+1 difference values ​​is greater than the adversarial sample determination threshold corresponding to the spoofing attack type, then the image sample to be detected is determined to be an adversarial sample of the spoofing attack type; if the range of the M+1 difference values ​​is not greater than the adversarial sample determination threshold corresponding to the spoofing attack type, then the image sample to be detected is determined to be a normal sample.

3. The adversarial sample detection method of claim 1, wherein, The step of determining the difference values ​​between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference values ​​between the sample features of the reference image sample and the compressed features, includes: The Euclidean distance between the sample features of the reference image sample and the sample features of the image sample to be detected, and the Euclidean distance between the sample features of the reference image sample and the compressed features are calculated respectively, and the Euclidean distance is determined as the difference value.

4. The adversarial sample detection method of claim 1, wherein, The first model determines whether the image sample to be detected is an adversarial sample. The first model includes the M compressors and a feature extraction model. The feature extraction model is used to extract sample features from the reference image sample and the image sample to be detected. Before determining whether the image sample to be detected is an adversarial sample, the method further includes: For any normal sample in the normal sample set, the first model is used to obtain the range of M+1 difference values corresponding to the normal sample, wherein the normal sample includes image pairs; The adversarial sample determination threshold is determined based on the range value corresponding to each normal sample in the normal sample set; the adversarial sample determination threshold is the adversarial sample determination threshold corresponding to the evasion attack type or the adversarial sample determination threshold corresponding to the spoofing attack type. Using the first model based on the adversarial sample determination threshold, adversarial sample detection is performed on each adversarial sample in the adversarial sample set to obtain the detection results; Based on the detection results and the true results of each adversarial sample annotation in the adversarial sample set, the parameters of the M compressors are adjusted to optimize the first model until the detection results of the first model meet the preset conditions.

5. The adversarial example detection method according to claim 4, characterized in that, The step of determining the adversarial example determination threshold based on the range value corresponding to each normal sample in the normal sample set includes: The range values ​​corresponding to each normal sample in the normal sample set are sorted, and the Nth range value is used as the adversarial sample determination threshold, where N is the number of samples in the normal sample set multiplied by a preset ratio.

6. The adversarial example detection method according to claim 4, characterized in that, When the attack type is an evasion attack type, the normal sample set is an image sample set of the same object, and the adversarial sample set is an evasion attack sample set; In the case of an attack type of spoofing, the normal sample set is a set of image samples of different objects, and the adversarial sample set is a set of spoofing attack samples.

7. The adversarial example detection method according to claim 1, characterized in that, The M compressors include at least two of the following: a bit compressor, a median smoothing compressor, and a nonlocal mean smoothing compressor. The parameters of the compressor include: the number of bits of the bit compressor, the first window size of the median smoothing compressor, the second window size of the nonlocal mean smoothing compressor, and the number of similar windows.

8. The adversarial example detection method according to claim 1, characterized in that, The step of compressing the image samples to be detected using M compressors includes: When the compressor is a bit compressor, for any pixel value of the image sample to be detected, the pixel value is bit compressed to obtain a compressed pixel value; the compressed pixel value is represented based on a second number of bits, where the second number of bits is less than the first number, and any pixel value is represented based on the first number of bits; When the compressor is a median smoothing compressor, for any pixel value of the image sample to be detected, the median of the pixel values ​​within a first window containing the pixel value is obtained, and the median is used as the compressed pixel value; When the compressor is a nonlocal mean smoothing compressor, for any pixel value of the image sample to be detected, a second window containing the pixel value is obtained, P windows similar to the second window are obtained in the image sample to be detected, and the weighted average of the pixel values ​​of the P windows similar to the second window is used as the compressed pixel value, where P is a positive integer.

9. The adversarial example detection method according to claim 1, characterized in that, Both the reference image sample and the image sample to be detected are face image samples.

10. An adversarial sample detection device, characterized in that it comprises:
The feature extraction module is used to extract sample features from the reference image sample and the image sample to be detected;
The feature extraction module is also used to compress the image sample to be detected and extract the compression features of the compressed image sample;
The processing module is used to determine the difference value between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference value between the sample features of the reference image sample and the compressed features, respectively;
The processing module is further configured to determine whether the image sample to be detected is an adversarial sample based on the difference between the sample features of the reference image sample and the sample features of the image sample to be detected, and the difference between the sample features of the reference image sample and the compressed features;
The feature extraction module is specifically used for:
The image sample to be detected is input into M compressors for compression, and the compression features of the M compressed image samples are extracted; M is an integer greater than 1; each of the M compressors uses a different compression method;
The processing module is specifically used for:
If the difference between the sample features of the reference image sample and the sample features of the image sample to be detected is greater than the target determination threshold, then the adversarial sample determination threshold is determined to be the adversarial sample determination threshold corresponding to the evasion attack type; the target determination threshold is used to determine whether the image samples being compared are image samples of the same object;
If the difference between the sample features of the reference image sample and the sample features of the image sample to be detected is not greater than the target determination threshold, then the adversarial sample determination threshold is determined to be the adversarial sample determination threshold corresponding to the spoofing attack type;
The difference values between the sample features of the reference image sample and the sample features of the image sample to be detected are determined, and the difference values between the sample features of the reference image sample and each of the M compressed features are determined, resulting in M+1 difference values;
Based on the range of the M+1 difference values and the adversarial sample determination threshold, it is determined whether the image sample to be detected is an adversarial sample.

11. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, When the processor executes the program, it implements the adversarial sample detection method as described in any one of claims 1 to 9.

12. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, When executed by a processor, the computer program implements the adversarial sample detection method as described in any one of claims 1 to 9.

13. A computer program product having executable instructions stored thereon, characterized in that, When executed by the processor, this instruction causes the processor to implement the adversarial sample detection method as described in any one of claims 1 to 9.