Face adversarial sample generation method, related device, equipment and storage medium
By generating and updating interference curves, the concealment of adversarial examples on faces is increased, which solves the problem of poor concealment in existing technologies, improves the defense capability of face recognition systems, and ensures system security.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: BEIJING REALAI TECH CO LTD
- Filing Date: 2023-09-27
- Publication Date: 2026-06-19
AI Technical Summary
In existing technologies, physical-world adversarial examples are poorly concealed and easily identified by defensive measures. This weakens their effectiveness and makes it difficult to effectively defend facial recognition systems against attacks.
An interference curve generator is acquired and used to generate an initial interference curve, which is added to the initial face image. The interference curve generator is then updated using the loss value so that it generates a more covert target interference curve, thereby increasing the covertness and attack effect of the face adversarial example.
It improves the facial recognition system's ability to defend against physical-world attacks, detects and fixes system vulnerabilities, and ensures the security of the facial recognition system.
Figure CN117315395B_ABST
Description
Technical Field
[0001] This application relates to the field of face recognition technology, and more specifically to a method, related apparatus, device, and storage medium for generating adversarial examples of faces.
Background Technology
[0002] With the widespread application of AI products in facial recognition, facial security has become a critical issue concerning public privacy and property safety. To ensure the security of facial recognition systems, adversarial attacks are often employed to identify and promptly patch vulnerabilities. Among existing adversarial attacks, physical-world adversarial attacks are the most difficult to defend against and pose the greatest threat. Physical-world adversarial attacks involve superimposing adversarial perturbations onto a person's face in the physical world, such as using masks, hats, glasses, or wigs to obscure or alter facial features, causing the facial recognition system to output incorrect results.
[0003] In existing technologies, to prevent physical-world adversarial attacks, adversarial masks, hats, and glasses are commonly used to create adversarial examples of faces for use against facial recognition systems. However, these adversarial examples have clear boundaries and obvious adversarial features, resulting in poor concealment and making them easy targets for corresponding defensive measures, thus weakening the effectiveness of the adversarial examples.
Summary of the Invention
[0004] This application provides a method, related apparatus, device, and storage medium for generating adversarial examples of faces. By updating the interference curve generator, the concealment of the interference curve superimposed on the face can be increased, thereby improving the attack effect of the generated adversarial examples of faces in the physical world and ensuring the security of the face recognition system.
[0005] In a first aspect, embodiments of this application provide a method for generating adversarial examples of faces. The method includes: acquiring an interference curve generator, an initial face image, and a reference face image; generating an initial interference curve using the interference curve generator; adding interference to the initial face image using the initial interference curve to obtain an adversarial face image; updating the interference curve generator based on the loss value between the reference face image and the adversarial face image to obtain an updated interference curve generator; and generating a target interference curve using the updated interference curve generator to obtain physical-world adversarial examples of faces from the target interference curve.
[0006] In a second aspect, embodiments of this application provide an apparatus for generating adversarial examples of faces, which has the function of implementing the method for generating adversarial examples of faces corresponding to the first aspect described above. The function can be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function, and the modules can be software and / or hardware.
[0007] In one embodiment, the face adversarial example generation apparatus includes: an input / output module configured to acquire an interference curve generator, an initial face image, and a reference face image; and a processing module configured to: generate an initial interference curve from the interference curve generator; add interference to the initial face image using the initial interference curve to obtain an adversarial face image; update the interference curve generator based on the loss value between the reference face image and the adversarial face image to obtain an updated interference curve generator; and generate a target interference curve from the updated interference curve generator to obtain physical-world face adversarial examples from the target interference curve.
[0008] In a third aspect, embodiments of this application provide a computer-readable storage medium including instructions that, when executed on a computer, cause the computer to perform the face adversarial example generation method described in the first aspect.
[0009] In a fourth aspect, embodiments of this application provide a computing device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the face adversarial example generation method described in the first aspect.
[0010] In a fifth aspect, embodiments of this application provide a chip that includes a processor coupled to a transceiver of a terminal device, for executing the technical solution provided in the first aspect of embodiments of this application.
[0011] In a sixth aspect, embodiments of this application provide a chip system including a processor for supporting a terminal device in implementing the functions involved in the first aspect above, such as generating or processing information involved in the method for generating adversarial samples of faces provided in the first aspect above.
[0012] In one possible design, the aforementioned chip system also includes a memory for storing program instructions and data necessary for the terminal. The chip system can be composed of chips or may include chips and other discrete components.
[0013] In a seventh aspect, embodiments of this application provide a computer program product including program instructions which, when run on a computer or processor, cause the computer or processor to execute the face adversarial sample generation method provided in the first aspect.
[0014] Compared to existing technologies, in this embodiment, an interference curve generator, an initial face image, and a reference face image are acquired; an initial interference curve is generated by the interference curve generator; interference is added to the initial face image using the initial interference curve to obtain an adversarial face image; the interference curve generator is updated based on the loss value between the reference face image and the adversarial face image to obtain an updated interference curve generator; a target interference curve is generated by the updated interference curve generator to obtain physical world adversarial face samples.
[0015] Therefore, this embodiment obtains a reference face image as the face image to be attacked. An initial interference curve generated by an interference curve generator is used to add interference to the initial face image, making the generated adversarial face image more similar to the image that might be attacked in the face recognition system, and thus more concealed. Then, the loss value between the adversarial face image and the reference face image is calculated and backpropagated to the interference curve generator, gradually updating the generator's parameters. The interference curve thereby evolves in the direction that decreases the loss function, so the updated interference curve generator can generate an interference curve that better meets expectations. This further increases the concealment of the target interference curve that the updated generator superimposes on the face, thereby improving the attack effect of the generated physical-world face adversarial sample. This physical-world face adversarial sample can be used to launch adversarial attacks against the face recognition system, detect its vulnerabilities, and fix them promptly, thereby improving the system's ability to prevent physical-world adversarial attacks and ensuring its security.
Attached Figure Description
[0016] The objectives, features, and advantages of the embodiments of this application will become readily understood by referring to the accompanying drawings and the detailed description of the embodiments. Wherein:
[0017] Figure 1 is a schematic diagram of a face adversarial example generation system for the face adversarial example generation method in the embodiments of this application;
[0018] Figure 2 is a schematic flowchart of a method for generating adversarial examples of faces according to an embodiment of this application;
[0019] Figure 3 is a schematic diagram of superimposing an interference image on an initial face image to obtain an adversarial face image in the face adversarial example generation method according to an embodiment of this application;
[0020] Figure 4 is a schematic diagram of a linear three-dimensional Bézier curve mask for the face adversarial example generation method of this application embodiment;
[0021] Figure 5 is a schematic diagram of physical-world adversarial face examples for the face adversarial example generation method of this application embodiment;
[0022] Figure 6 is a schematic diagram of the workflow of the face adversarial sample generation system for the face adversarial sample generation method of this application embodiment;
[0023] Figure 7 is a schematic diagram of the structure of the face adversarial example generation device according to an embodiment of this application;
[0024] Figure 8 is a schematic diagram of the structure of a computing device according to an embodiment of this application;
[0025] Figure 9 is a schematic diagram of the structure of a mobile phone in one embodiment of this application;
[0026] Figure 10 is a schematic diagram of a server structure in one embodiment of this application.
[0027] In the accompanying drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Implementation
[0028] The terms "first," "second," etc., in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects (e.g., first xx and second xx represent different xx, and so on), and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or modules is not necessarily limited to those explicitly listed, but may include other steps or modules not explicitly listed or inherent to these processes, methods, products, or devices. The division of modules in the embodiments of this application is merely a logical division; in actual applications, there may be other division methods. For example, multiple modules may be combined or integrated into another system, or some features may be ignored or not performed. Additionally, the shown or discussed mutual coupling or direct coupling or communication connection may be through some interface, indirect coupling between modules, or electrical or other similar forms of communication connection, none of which are limited in the embodiments of this application. Furthermore, the modules or sub-modules described as separate components may or may not be physically separated, may or may not be physical modules, or may be distributed among multiple circuit modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiments of this application.
[0029] This application also provides a method, related apparatus, device, and storage medium for generating adversarial examples of faces, applicable to a system for generating adversarial examples of faces in adversarial attack scenarios. This system may include a training device and a generation device. The training device is at least used to acquire an interference curve generator, an initial face image, and a reference face image; generate an initial interference curve from the interference curve generator; add interference to the initial face image using the initial interference curve to obtain an adversarial face image; update the interference curve generator based on the loss value between the reference face image and the adversarial face image to obtain an updated interference curve generator; and generate a target interference curve from the updated interference curve generator. The generation device is at least used to obtain physical-world adversarial examples of faces from the target interference curve. The training device can be an application that updates the interference curve generator to obtain an updated interference curve generator, and generates a target interference curve from the updated interference curve generator. The application of the training device is, for example, an artificial intelligence model. The training device can also be a server or terminal device on which the artificial intelligence model is deployed. The generation device can be an application that obtains physical world face adversarial examples from the target interference curve, or a server or terminal device on which the application that obtains physical world face adversarial examples from the target interference curve is installed.
[0030] The solutions provided in this application involve technologies such as Artificial Intelligence (AI), Computer Vision (CV), and Machine Learning (ML), which are specifically illustrated through the following embodiments:
[0031] AI, or Artificial Intelligence, refers to the theories, methods, technologies, and application systems that utilize digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to achieve optimal results. In other words, Artificial Intelligence is a comprehensive technology within computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine capable of reacting in a manner similar to human intelligence. Artificial Intelligence studies the design principles and implementation methods of various intelligent machines, enabling them to possess the functions of perception, reasoning, and decision-making.
[0032] AI technology is a comprehensive discipline encompassing a wide range of fields, including both hardware and software technologies. Fundamental AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operating / interactive systems, and mechatronics. AI software technologies primarily include computer vision, speech processing, natural language processing, and machine learning / deep learning.
[0033] Computer vision (CV) is the science that studies how to enable machines to "see." More specifically, it refers to machine vision, which uses cameras and computers to replace human eyes for target recognition, tracking, and measurement, and then performs image processing to create images more suitable for human observation or transmission to instruments. As a scientific discipline, computer vision studies related theories and technologies, attempting to build artificial intelligence systems capable of extracting information from images or multidimensional data. Computer vision technologies typically include adversarial perturbation generation, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content / behavior recognition, 3D object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping (SLAM), and common biometric recognition technologies such as facial recognition and fingerprint recognition.
[0034] In existing technologies, to prevent physical-world adversarial attacks, adversarial masks, hats, and glasses are commonly used to create adversarial examples of faces for use in physical-world adversarial attacks on facial recognition systems. However, these adversarial examples are easily affected by physical factors such as lighting conditions, distance, and shooting angle, leading to unstable adversarial effects. Moreover, their clear boundaries and obvious adversarial features result in poor concealment, making them easy targets for corresponding defensive measures, thus weakening the effectiveness of the adversarial examples.
[0035] Compared to existing technologies, in this embodiment, a reference face image is obtained as the face image to be attacked. An initial interference curve generated by an interference curve generator is used to add interference to the initial face image, making the generated adversarial face image more similar to the image that might be attacked in the face recognition system, and thus more concealed. Then, the loss value between the adversarial face image and the reference face image is calculated and backpropagated to the interference curve generator, gradually updating the generator's parameters. The interference curve thereby evolves in the direction that decreases the loss function, so the updated interference curve generator can generate an interference curve that better meets expectations. This further increases the concealment of the target interference curve that the updated generator superimposes on the face, thereby improving the attack effect of the generated physical-world face adversarial sample. This physical-world face adversarial sample can be used to launch adversarial attacks against face recognition systems, detect their vulnerabilities, and fix them promptly, thereby improving the face recognition system's ability to prevent physical-world adversarial attacks and ensuring its security.
[0036] In some implementations, the training device and the generation device are deployed separately, as shown in Figure 1. The face adversarial example generation method provided in this embodiment can be implemented on the face adversarial example generation system shown in Figure 1. This system may include a server 01 and a terminal device 02.
[0037] The server 01 can be a training device, in which the processing program of the training device can be deployed.
[0038] The terminal device 02 can be a generating device, in which a processing program for the generating device can be deployed.
[0039] Server 01 can acquire an interference curve generator, an initial face image, and a reference face image; generate an initial interference curve using the interference curve generator; add interference to the initial face image using the initial interference curve to obtain an adversarial face image; update the interference curve generator based on the loss values of the reference face image and the adversarial face image to obtain an updated interference curve generator; generate a target interference curve using the updated interference curve generator; and send the generated target interference curve to terminal device 02. Terminal device 02 can obtain physical-world adversarial face samples from the target interference curve.
[0040] It should be noted that the server involved in the embodiments of this application can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
[0041] The terminal devices involved in the embodiments of this application can be devices that provide voice and / or data connectivity to users, handheld devices with wireless connectivity, or other processing devices connected to a wireless modem. Examples include mobile phones (or "cellular" phones) and computers with mobile terminals, such as portable, pocket-sized, handheld, computer-embedded, or vehicle-mounted mobile devices that exchange voice and / or data with a wireless access network. Examples include Personal Communication Service (PCS) phones, cordless phones, Session Initiation Protocol (SIP) phones, Wireless Local Loop (WLL) stations, Personal Digital Assistants (PDAs), and other similar devices.
[0042] It should be noted that the specific implementation of this application involves user-related data, such as reference face images, initial face images, training adversarial face images, and face images collected by the face recognition system. When the embodiments of this application are applied to specific products or technologies, user permission or consent is required, and the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.
[0043] Referring to Figure 2, which is a flowchart of a method for generating adversarial examples of faces provided in an embodiment of this application. The method can be executed by a face adversarial example generation device and can be applied to face recognition adversarial attack scenarios. It involves acquiring an interference curve generator, an initial face image, and a reference face image; generating an initial interference curve using the interference curve generator; adding interference to the initial face image using the initial interference curve to obtain an adversarial face image; updating the interference curve generator based on the loss value between the reference face image and the adversarial face image to obtain an updated interference curve generator; and generating a target interference curve using the updated interference curve generator to obtain physical-world face adversarial examples from the target interference curve.
[0044] The method includes steps 101-105:
[0045] Step 101: Obtain the interference curve generator, the initial face image, and the reference face image.
[0046] In this context, an interference curve generator refers to a generator used to generate interference curves, which are curves used to simulate physical-world attacks. For example, interference curves can be interference signals or texture perturbations that can be added to an image. These interference curves can be generated based on different mathematical models such as noise models, curve models, and texture models.
[0047] The reference face image refers to the image used for comparison in face recognition. It serves as a benchmark for comparison and calculation with the adversarial face image. The initial face image refers to the original image used to generate the adversarial face image.
[0048] In some implementations, the interference curve is a Bézier curve, and the interference curve generator is a Bézier curve generator. A Bézier curve is a mathematical curve commonly used in computer graphics and computer-aided design. It consists of a set of control points and a coefficient vector, used to draw a smooth curve. A Bézier curve generator is a tool or algorithm used to generate Bézier curves. It can calculate and draw the corresponding Bézier curve based on given control points and parameters. For example, a Bézier curve generator can include, but is not limited to, generators built into graphics editing software, or Bézier curve function methods. Bézier curves can simulate the natural curves of a human face contour, making the interference effect more realistic. Compared to simple interference signals or texture perturbations, Bézier curves can better preserve the overall shape and structure of the face, reducing the recognizability of the interfered image, increasing concealment and attack effectiveness. Furthermore, Bézier curves for facial interference allow for precise control of the location and degree of interference, achieving perturbation effects at different locations and to varying degrees.
[0049] In some implementations, the interference curve is a differentiable Bézier curve. A differentiable Bézier curve can be represented in matrix form, and its derivatives can be calculated by applying differential operators. From these derivatives, properties such as the curve's slope and curvature can be calculated. Furthermore, differentiable Bézier curves can be optimized and approximated using curvature update algorithms.
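As an illustration of the curve model described above, the following minimal sketch evaluates a Bézier curve and its first derivative from a list of control points using the Bernstein basis. The function names (`bernstein`, `bezier_point`, `bezier_tangent`) and the sample control points are hypothetical and are not taken from this application; the sketch only shows why such a curve is differentiable with respect to its parameter.

```python
from math import comb


def bernstein(n, i, t):
    """Bernstein basis polynomial B_{i,n}(t)."""
    return comb(n, i) * (t ** i) * ((1 - t) ** (n - i))


def bezier_point(ctrl, t):
    """Point on the Bezier curve defined by control points `ctrl` at t in [0, 1]."""
    n = len(ctrl) - 1
    x = sum(bernstein(n, i, t) * p[0] for i, p in enumerate(ctrl))
    y = sum(bernstein(n, i, t) * p[1] for i, p in enumerate(ctrl))
    return (x, y)


def bezier_tangent(ctrl, t):
    """First derivative dB/dt.

    The derivative of a degree-n Bezier curve is a degree-(n-1) Bezier curve
    whose control points are the scaled differences n * (P[i+1] - P[i]).
    """
    n = len(ctrl) - 1
    diffs = [
        (n * (ctrl[i + 1][0] - ctrl[i][0]), n * (ctrl[i + 1][1] - ctrl[i][1]))
        for i in range(n)
    ]
    return bezier_point(diffs, t)


# Hypothetical cubic curve with four control points.
ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
midpoint = bezier_point(ctrl, 0.5)
start_tangent = bezier_tangent(ctrl, 0.0)
```

The slope at any parameter value follows from the two components of the tangent, which is the kind of property the paragraph above refers to.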
[0050] Step 102: Generate the initial interference curve using the interference curve generator.
[0051] The initial interference curve refers to the initial curve used to generate adversarial face images. The initial interference curve can be used as the starting point for adversarial face image generation. The initial interference curve can be randomly generated, predefined, or generated using a specific optimization algorithm.
[0052] For example, the number of control points required for the Bézier curve can be predefined according to the application scenario or actual needs. These control points will determine the shape of the generated curve, and then the Bézier curve generator will generate the initial Bézier curve based on the control point information.
[0053] In some implementations, the interference curve generator can first be initialized, and the initialized interference curve generator can then generate the initial interference curve. For example, the parameters of the Bézier curve generator can be initialized, and the initialized Bézier curve generator can then generate the initial Bézier curve, i.e., the initial interference curve, based on the control point information.
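Step 102 can be sketched as follows, assuming random initialization (one of the options mentioned above) and a predefined number of control points. The helper names `init_control_points` and `sample_curve`, the 112 x 112 image size, and the fixed seed are illustrative assumptions only.

```python
import random
from math import comb


def init_control_points(num_points, width, height, seed=0):
    """Randomly place control points inside an image of the given size."""
    rng = random.Random(seed)
    return [
        (rng.uniform(0, width - 1), rng.uniform(0, height - 1))
        for _ in range(num_points)
    ]


def sample_curve(ctrl, num_samples=50):
    """Sample the Bezier curve defined by `ctrl` at evenly spaced parameters."""
    n = len(ctrl) - 1
    points = []
    for k in range(num_samples):
        t = k / (num_samples - 1)
        weights = [comb(n, i) * t ** i * (1 - t) ** (n - i) for i in range(n + 1)]
        x = sum(w * p[0] for w, p in zip(weights, ctrl))
        y = sum(w * p[1] for w, p in zip(weights, ctrl))
        points.append((x, y))
    return points


# Hypothetical setting: four control points (a cubic curve) in a 112 x 112 image.
ctrl = init_control_points(4, width=112, height=112, seed=0)
curve = sample_curve(ctrl)
```

Because a Bézier curve stays inside the convex hull of its control points, placing the control points inside the image keeps the sampled initial curve inside the image as well.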
[0054] Step 103: Add interference to the initial face image using the initial interference curve to obtain an adversarial face image.
[0055] Here, an adversarial face image refers to a face image to which interference has been added.
[0056] For example, the control points of an initial Bézier curve can be matched with pixels in a face image. The initial Bézier curve is then superimposed onto the matching pixels in the face image, thus generating an adversarial face image. Understandably, by superimposing the initial Bézier curve onto the face image, adversarial textures are created in specific areas of the face. This allows the face recognition model to identify the features of the adversarial face image as features of the reference face image, increasing the stealth and attack effectiveness of the adversarial face image. Therefore, in adversarial attacks on face recognition systems, adversarial examples can evade recognition or be misidentified as another person by adding interference. Thus, in generating adversarial face examples, a reference face image can be obtained as the target face image. An initial interference curve is used to add interference to the initial face image, making the generated adversarial face image more similar to the potentially attacked images in the face recognition system. This allows the generated adversarial face example to detect vulnerabilities where the face recognition system might misidentify the adversarial example as another person, enabling timely patching of these vulnerabilities and increasing the security of the face recognition system.
[0057] In some implementations, the initial interference curve can be converted into an image form, which can then be fused pixel-by-pixel with the initial face image to quickly obtain an adversarial face image. Specifically, adding interference to the initial face image using the initial interference curve to obtain an adversarial face image includes:
[0058] An interference image is generated from the initial interference curve;
[0059] The first image pixel of the interference image is fused with the second image pixel of the initial face image pixel by pixel to obtain the adversarial face image.
[0060] The first image pixel and the second image pixel are pixels in the interference image and the initial face image, respectively.
[0061] For example, the initial Bézier curve can be converted into an interference image of the same size as the initial face image. Figure 3 shows the process of superimposing an interference image on an initial face image to obtain an adversarial face image. A blank image with a transparent background can be created, and the initial Bézier curve is then drawn on this blank image to obtain the interference image b. The pixel values of the adversarial face image adv can be obtained by multiplying each first image pixel in the interference image b by the corresponding second image pixel in the initial face image att, using the formula adv = att ⊙ b, where ⊙ denotes pixel-by-pixel multiplication. These pixel values are then used to generate the adversarial face image on which the initial interference curve is superimposed. For example, the initial Bézier curve in the blank image can be set to black (RGB(0,0,0)). When a first image pixel on the initial Bézier curve is multiplied by the corresponding second image pixel in the initial face image, the value of that second image pixel becomes RGB(0,0,0). All the corresponding second image pixels in the initial face image are thus updated to black, superimposing the black initial interference curve onto the initial face image to obtain the adversarial face image. Since the background pixels of the blank image are transparent (their Alpha channel value is usually 0), they do not affect the result of the product, which is equivalent to ignoring the contents of the transparent area entirely. Only the opaque areas corresponding to the initial Bézier curve are multiplied pixel by pixel to produce the superposition effect.
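The pixel-by-pixel fusion adv = att ⊙ b can be sketched as follows for a single-channel image stored as nested lists. As an assumption of this sketch, the transparent background is modeled by the value 1.0, so that multiplication leaves those pixels unchanged, while curve pixels are 0.0 (black). The names `rasterize_curve` and `fuse` and the tiny 5 x 5 image are hypothetical.

```python
def rasterize_curve(points, width, height):
    """Build the interference image b.

    Background pixels are 1.0 (they leave the face pixel unchanged when
    multiplied); pixels hit by a curve sample are 0.0 (black).
    """
    b = [[1.0] * width for _ in range(height)]
    for x, y in points:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < height and 0 <= col < width:
            b[row][col] = 0.0
    return b


def fuse(att, b):
    """Pixel-by-pixel product adv = att ⊙ b for two images of equal size."""
    return [
        [a * m for a, m in zip(att_row, b_row)]
        for att_row, b_row in zip(att, b)
    ]


# Hypothetical 5 x 5 gray face image and a curve along the main diagonal.
att = [[0.5] * 5 for _ in range(5)]
diagonal = [(i, i) for i in range(5)]
b = rasterize_curve(diagonal, width=5, height=5)
adv = fuse(att, b)
```

In practice the curve would need to be sampled densely enough that consecutive samples land on adjacent pixels; otherwise the rasterized line has gaps.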
[0062] In some implementations, the line style, thickness, and color of the initial interference curve can be set, and control points can be adjusted, depending on the application scenario or actual needs, to increase the concealment and attack effect of the adversarial face image. For example, after generating the initial Bézier curve, the thickness of the curve can be adjusted according to the thickness of lines such as the facial outline in the initial face image, or the color of the curve can be updated according to the hue of the initial face image, so that the updated initial Bézier curve is more similar to the initial face image, and then the adversarial face image is generated using the updated initial Bézier curve. By designing a Bézier curve thickness adjustment function, the line thickness can be dynamically adjusted during algorithm optimization, thereby minimizing the thickness of the curve while ensuring that the adversarial similarity meets the standard. It should be noted that, in order to facilitate pixel-by-pixel fusion, the line color of the updated initial Bézier curve can be set to black during pixel fusion. After generating the adversarial face image, the color of the second image pixel corresponding to the initial interference curve in the image is set to the aforementioned curve color updated according to the hue.
[0063] In some implementations, to reduce the recognizability of the interfered image, increase its concealment and attack effectiveness, the size of the face image can be pre-adjusted so that the obtained initial face image matches the interfering image, and the Bézier curve can be better superimposed on the face. Specifically, before obtaining the adversarial face image by adding interference to the initial face image using the initial interference curve, the process further includes:
[0064] Obtain the original face image;
[0065] The original face image is adjusted to match the interfering image. The adjustment process includes at least one of cropping, scaling, and padding.
[0066] For example, when defining the control points of a Bézier curve, a mapping relationship between the control points and facial landmarks can be established in advance. First, an original face image is captured using an image acquisition device such as a camera. Then, a facial landmark detection algorithm, such as a neural network model based on a convolutional network, is used to detect faces in the original face image, identifying facial landmarks. Next, the original face image is scaled to match the identified facial landmarks with the control points of the Bézier curve. Finally, the scaled original face image is cropped and padded to ensure that the resulting initial face image is the same size as the interfering image.
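The size adjustment can be sketched as follows. This is a simplified illustration under stated assumptions: facial landmark detection and landmark-to-control-point alignment are omitted, scaling uses nearest-neighbour sampling, and only scaling plus zero padding are shown (cropping would follow the same slicing pattern). The function name `fit_to_size` is hypothetical.

```python
import numpy as np

def fit_to_size(img, target_h, target_w):
    """Scale img to fit inside (target_h, target_w), then zero-pad to that size."""
    h, w = img.shape[:2]
    scale = min(target_h / h, target_w / w)  # keep the aspect ratio
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    # Nearest-neighbour scaling: map each output row/column to a source index.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    scaled = img[rows][:, cols]
    # Centre the scaled image on a zero-filled canvas (padding).
    out = np.zeros((target_h, target_w) + img.shape[2:], dtype=img.dtype)
    top = (target_h - new_h) // 2
    left = (target_w - new_w) // 2
    out[top:top + new_h, left:left + new_w] = scaled
    return out

# Toy data: a 100x50 "face" image fitted to a 64x64 interfering image.
face = np.ones((100, 50, 3))
fitted = fit_to_size(face, 64, 64)
```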
[0067] Step 104: Update the interference curve generator based on the loss values of the reference face image and the adversarial face image to obtain the updated interference curve generator.
[0068] For example, based on the application scenario or actual needs, a predefined loss function, such as Mean Squared Error or Perceptual Loss, can be used to calculate the loss value between the reference face image and the adversarial face image. This loss value is then backpropagated to the interference curve generator. By calculating the gradient of the loss function with respect to the parameters of the interference curve generator, an optimization algorithm, such as Adam (Adaptive Moment Estimation), is used to iteratively update the parameters of the interference curve generator based on the gradient information. Depending on the direction and magnitude of the gradient, the parameters can be updated along the gradient descent direction until the loss value falls below a preset threshold. By gradually updating the parameters of the interference curve generator through gradient optimization, the interference curve grows in the direction of decreasing loss function, enabling the updated interference curve generator to generate interference curves that better meet the desired outcome.
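The update loop described above can be sketched with a hand-written Adam optimizer. This is only an illustration: a toy mean-squared-error loss over three parameters stands in for the image loss and for the generator parameters, and the hyperparameter values, the threshold, and the names `adam_minimize` and `toy_loss` are assumptions.

```python
import numpy as np

def adam_minimize(loss_and_grad, params, lr=0.05, beta1=0.9, beta2=0.999,
                  eps=1e-8, threshold=1e-4, max_steps=5000):
    """Update params with Adam until the loss falls below threshold."""
    params = np.asarray(params, dtype=float).copy()
    m = np.zeros_like(params)  # first-moment (mean) estimate of the gradient
    v = np.zeros_like(params)  # second-moment estimate of the gradient
    loss = float("inf")
    for step in range(1, max_steps + 1):
        loss, grad = loss_and_grad(params)
        if loss < threshold:  # stop once the preset threshold is reached
            break
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        m_hat = m / (1 - beta1 ** step)  # bias correction
        v_hat = v / (1 - beta2 ** step)
        params -= lr * m_hat / (np.sqrt(v_hat) + eps)  # descent step
    return params, loss

# Toy stand-in for the image loss: MSE between parameters and a fixed target.
target = np.array([0.3, 0.7, 0.5])

def toy_loss(p):
    diff = p - target
    return float(np.mean(diff ** 2)), 2 * diff / diff.size

params, final_loss = adam_minimize(toy_loss, np.zeros(3))
```

In a real pipeline the gradient would come from backpropagation through the face recognition model and the curve rendering step rather than from a closed-form expression.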
[0069] In some implementations, the similarity between different encoded features can be measured by cross-calculating the first encoded vector of the reference face image and the second encoded vector of the adversarial face image, thereby capturing more feature differences. A difference value can be obtained by calculating the difference between the first feature representation of the reference face image and the second feature representation of the adversarial face image. Combining the encoded cross value and the feature representation difference value yields a more accurate first loss value. This loss value provides accurate feature representation and measurement standards for the generation of adversarial examples, thereby improving the attack effect of adversarial examples. When adversarial examples are used to launch adversarial attacks against face recognition systems, this enhances the face recognition system's ability to prevent physical-world adversarial attacks and ensures the security of the face recognition system. Specifically, the loss value includes the first loss value, and the method further includes:
[0070] Obtain the first encoding vector of the reference face image, the first feature representation of the reference face image, the second encoding vector of the adversarial face image, and the second feature representation of the adversarial face image;
[0071] The first and second coding vectors are cross-calculated to obtain the coding cross value;
[0072] The difference between the first feature representation and the second feature representation is calculated to obtain the feature representation difference value;
[0073] The first loss value is calculated from the encoded cross value and the feature representation difference value.
[0074] Here, the encoded vector refers to the vector representation obtained by encoding an image. The first encoded vector and the second encoded vector refer to the vector representations obtained by encoding the reference face image and the adversarial face image, respectively. For example, an encoding network can be used to map the image to a low-dimensional vector space to obtain the encoded vector of the image. This application does not limit the specific form of the encoding network used to encode the image; for example, the encoding network can be an autoencoder, a convolutional neural network (CNN), or a variational autoencoder (VAE), etc.
[0075] Feature representation refers to the vector representation of abstract features extracted from an image. Compared to the encoding vector, the feature representation usually has a higher dimension and can provide more detailed and richer image information. The first feature representation and the second feature representation refer to the vector representations of abstract features extracted from the reference face image and the adversarial face image, respectively. For example, a pre-trained face recognition model can be used to extract features from an image to obtain the feature representation of the image. This application does not limit the specific form of the pre-trained face recognition model used for feature extraction. For example, the pre-trained face recognition model can be ArcFace, MobileFaceNet, InsightFace, CenterFace, SphereFace, or CosFace, etc.
[0076] Cross-computation refers to the process of combining multiple encoding vectors, such as through multiplication or summation. Difference calculation is the process of obtaining the difference in feature representations through subtraction.
[0077] For example, an encoding network is used to encode a reference face image or an adversarial face image to obtain an encoding vector. Then, a pre-trained face recognition model is used to extract features from the encoding vector, resulting in a feature representation of the image. Next, a cross-calculation is performed between the first encoding vector of the reference face image and the second encoding vector of the adversarial face image, and the difference between the feature representations of the reference face image and the adversarial face image is calculated. The cross-value obtained from the cross-calculation and the difference value obtained from the difference calculation are then summed to obtain the first loss value. Thus, the cross-value represents the relationship between the reference face image and the adversarial face image, and the difference value represents the distance between the feature representations in the face recognition model.
[0078] In some implementations, the coding cross value can be obtained by calculating the inner product of the first and second coding vectors, and the feature representation difference value can be obtained by calculating the mean of the squared difference between the first and second feature representations. Alternatively, the first loss value can be obtained by weighted summation of the coding cross value and the feature representation difference value. For example, the first loss value (loss) can be calculated as loss = α·(fadv*fvic) + β·mean((fadv_inter − fvic_inter)²), where fvic represents the first encoded vector, fadv represents the second encoded vector, fvic_inter represents the first feature representation, fadv_inter represents the second feature representation, * represents the inner product, mean() represents calculating the average, · represents multiplication, and α and β are weights. α and β can be set according to the application scenario, actual needs, or experience; for example, they can be directly set to 1. Specifically, the inner product between the first and second encoded vectors is calculated by fadv*fvic to better separate feature vectors from different categories in the feature space, enhancing their discriminative power. The term mean((fadv_inter − fvic_inter)²) calculates the squared difference between the second feature representation and the first feature representation and then takes the mean of these squared differences, which constrains the compactness of the feature distance of the intermediate layer of the face recognition model in the feature space, so that feature representations from the same category are closer. Then, the weights α and β are applied to fadv*fvic and mean((fadv_inter − fvic_inter)²), respectively, and a weighted summation is performed to balance the encoding cross value and the feature representation difference value, resulting in the first loss value. 
This first loss value can improve the feature discriminativeness and feature compactness in the face adversarial example generation task, thereby enhancing the attack effect of face adversarial examples. When face adversarial examples are used to launch adversarial attacks against face recognition systems, the ability of face recognition systems to prevent physical adversarial attacks is improved, thus ensuring the security of face recognition systems.
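The first loss value can be written out as a short NumPy sketch of loss = α·(fadv*fvic) + β·mean((fadv_inter − fvic_inter)²). The function name and the toy vectors are illustrative assumptions; in practice the inputs would come from the encoding network and from an intermediate layer of the face recognition model.

```python
import numpy as np

def first_loss(f_vic, f_adv, f_vic_inter, f_adv_inter, alpha=1.0, beta=1.0):
    """Return (loss, cross_value, diff_value) for the first loss value."""
    # Coding cross value: inner product of the two encoded vectors.
    cross_value = float(np.dot(f_adv, f_vic))
    # Feature representation difference value: mean squared difference.
    diff = np.asarray(f_adv_inter, dtype=float) - np.asarray(f_vic_inter, dtype=float)
    diff_value = float(np.mean(diff ** 2))
    # Weighted summation of the two terms.
    return alpha * cross_value + beta * diff_value, cross_value, diff_value

# Toy vectors (not real embeddings).
loss, cross_value, diff_value = first_loss(
    f_vic=[1.0, 0.0],
    f_adv=[0.5, 0.5],
    f_vic_inter=[1.0, 2.0, 3.0],
    f_adv_inter=[1.0, 2.0, 5.0],
)
```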
[0079] In some implementations, during the iterative update of the interference curve generator based on the loss value, the weights used in the weighted sum of the encoded cross-value and the feature representation difference value can be determined from the encoded cross-value and the feature representation difference value calculated during the first update. For example, for the first loss value calculated in the first iteration, the encoded cross-value and the feature representation difference value used to calculate that loss value can be obtained. The weight α corresponding to the encoded cross-value can be calculated as α = encoded cross-value / (encoded cross-value + feature representation difference value), and the weight β corresponding to the feature representation difference value can be calculated as β = feature representation difference value / (encoded cross-value + feature representation difference value), where / represents division.
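The weight calculation from the first iteration can be sketched as below. The function name is hypothetical, and the guard against a zero denominator is an added assumption that the source does not discuss.

```python
def initial_weights(cross_value, diff_value):
    """Derive alpha and beta from the values computed in the first iteration."""
    total = cross_value + diff_value
    if total == 0:
        # Assumption: the source does not say how to handle this case.
        raise ValueError("cross_value + diff_value must be non-zero")
    alpha = cross_value / total  # weight of the encoded cross-value
    beta = diff_value / total    # weight of the feature representation difference value
    return alpha, beta

alpha, beta = initial_weights(0.5, 1.5)
```

With these definitions the two weights always sum to 1, so they only rebalance the two terms relative to each other.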
[0080] In some implementations, the feature representation can be the feature representation of the input image output by any intermediate layer of the face recognition model. Here, an intermediate layer refers to a hidden layer between the input layer and the output layer of the face recognition model.
[0081] In some implementations, a face recognition model can be obtained to extract features from images and obtain feature representations, depending on the application scenario, in order to improve the accuracy of feature extraction in that application scenario. For example, a face dataset for the current application scenario, such as application scenario A, can be obtained, and a candidate face recognition model can be trained using this face dataset to optimize it to obtain a face recognition model suitable for application scenario A, which can then be used in the image feature extraction process of this application embodiment.
[0082] In some implementations, depending on the application scenario, one face recognition model can be selected from multiple candidate face recognition models to extract features from an image, thereby improving the accuracy of feature extraction in that application scenario. For example, multiple existing face recognition models, such as face recognition model 1, face recognition model 2, and face recognition model 3, can be used as candidate face recognition models. The candidate face recognition models are trained using a face dataset from the current application scenario, such as application scenario A. The performance of the trained face recognition models 1, 2, and 3 is compared using evaluation metrics such as precision and recall. The model with the best performance, such as trained face recognition model 2, is then used for feature extraction in this embodiment. In this way, multiple face recognition models are integrated as alternative models through meta-learning. During the optimization process, the contributions of different face recognition models to minimizing the loss function are scored, and the face recognition model with the highest score is selected as the alternative model for that round of optimization. This approach not only significantly improves adversarial similarity but also increases the transferability of adversarial examples.
[0083] In some implementations, the stealth score of the adversarial face image can be used as a second loss value to guide the realism of the generated adversarial face image, increasing its stealth and attack effectiveness, thereby improving the attack effect of the adversarial face example. This enhances the face recognition system's ability to prevent physical-world adversarial attacks when using adversarial face examples to launch such attacks, thus ensuring the security of the face recognition system. Specifically, the loss value also includes a second loss value, and the method further includes:
[0084] Obtain a pre-trained stealth discriminator, which is trained using stealth samples, each stealth sample including an adversarial face image used for training and its stealth score.
[0085] Use the pre-trained stealth discriminator to perform stealth prediction on the adversarial face image to obtain the stealth score of the adversarial face image.
[0086] The stealth score of the adversarial face image is used as the second loss value.
[0087] Here, "stealth" refers to the property of adversarial face images being difficult to detect. A stealth discriminator is a model or algorithm used to judge the stealth level of an adversarial face image. This application does not limit the specific form of the stealth discriminator; for example, it can be a discriminator in Generative Adversarial Networks (GANs), a classifier built using deep learning neural networks, or other neural network structures. Stealth scores can be used to measure the degree to which adversarial face images are difficult to detect. Generally, the higher the stealth score, the less likely the adversarial face image is to be detected, i.e., the more aggressive the adversarial face image is.
[0088] For example, discriminators or classifiers in existing technologies can be pre-trained using adversarial face images as training samples. During training, the stealth scores of the adversarial face images used for training serve as the sample labels, enabling the trained discriminator to predict the stealth score of an adversarial face image. The trained discriminator is then used as the stealth discriminator. Using loss = D(adv), the stealth of the adversarial face image is predicted, yielding the stealth score of the adversarial face image, i.e., the second loss value, loss. Here, D represents the stealth discriminator, and adv represents the adversarial face image. With the stealth discriminator, the loss function is minimized during the update of the interference curve generator, gradually improving the stealth and the attack effect of the interference curve superimposed on the face.
[0089] In some implementations, the total loss value can be obtained by combining the first loss value and the second loss value. Based on this total loss value, the interference curve generator can be updated to obtain the updated interference curve generator. The combination method can be summation or weighted summation. The weights can be set according to the application scenario or actual needs, adjusted based on the training effect or experience of the interference curve generator, or simply set to 1.
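As a minimal sketch, the combination of the two loss values reduces to a weighted sum. The function and parameter names are illustrative, and the numeric inputs are toy values rather than real loss values.

```python
def total_loss(first_loss_value, second_loss_value, w_first=1.0, w_second=1.0):
    """Combine the first and second loss values by (weighted) summation."""
    return w_first * first_loss_value + w_second * second_loss_value

# Plain summation: both weights left at 1.
summed = total_loss(0.8, 0.3)
# Weighted summation: the second (stealth) term counts half as much.
weighted = total_loss(0.8, 0.3, w_first=1.0, w_second=0.5)
```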
[0090] Step 105: The updated interference curve generator generates a target interference curve to obtain physical world face adversarial samples from the target interference curve.
[0091] The target interference curve refers to the interference curve generated by the updated interference curve generator.
[0092] For example, an updated interference curve generator can generate a target interference curve, which can then be superimposed on a face image to generate adversarial examples of faces in the physical world. These adversarial examples generated using the target interference curve from the updated generator possess higher concealment and attack effectiveness. They can detect vulnerabilities in face recognition systems and promptly patch them, thereby enhancing the face recognition system's ability to prevent physical world adversarial attacks and ensuring its security.
[0093] In some implementations, a three-dimensional interference curve mask corresponding to the target interference curve can be generated to attack the facial recognition system in the physical world, thereby improving the facial recognition system's ability to prevent physical-world adversarial attacks and ensuring the security of the facial recognition system. Specifically, the interference curve includes a Bézier curve, and the method further includes:
[0094] A three-dimensional interference curve model is generated from the target interference curve;
[0095] Generate a 3D interference curve mask corresponding to the 3D interference curve model;
[0096] Collect face images of a wearer wearing the 3D interference curve mask as physical-world adversarial face samples.
[0097] For example, 3D modeling software such as Blender or SolidWorks can be used to construct a 3D Bézier curve model, i.e., a 3D interference curve model, of the target Bézier curve. Then, a 3D printing device, such as a 3D printer, can be used to print the 3D Bézier curve model, thus obtaining the linear 3D Bézier curve mask shown in Figure 4, i.e., the 3D interference curve mask. An adversarial tester wears this mask, and face images from at least one angle are captured via an image acquisition device such as a camera to serve as physical-world adversarial face samples. For example, the physical-world adversarial face sample shown in Figure 5 is a frontal face image of the tester. The mask is a monochrome curved 3D Bézier mask printed using 3D printing equipment. The printing process is simple, the materials are inexpensive, and the thickness and color of the adversarial device (the 3D Bézier mask) can be controlled, minimizing its susceptibility to environmental influences. In addition, a linear device is used as the carrier of the adversarial perturbation. A linear device has a large area of influence but a small size, which greatly reduces the occlusion of the face by the perturbation device. The area influenced by the adversarial perturbation can therefore be extended to the entire face with minimal occlusion. Compared to a local perturbation of the same area, this type of perturbation can interfere with the most facial features.
[0098] In some implementations, the line style, thickness, and color of the target interference curve can be set, and control points can be adjusted, depending on the application scenario or actual needs. For example, the color of the target interference curve can be set to a single color to generate a single-color curve-type 3D interference curve mask. Alternatively, the color of the target Bézier curve can be set to black to obtain a black curve-type 3D Bézier curve mask through 3D printing equipment.
[0099] In some implementations, the initial face image is the face image of the wearer wearing a 3D interference curve mask, to increase the fit between the 3D interference curve mask and the wearer, thereby enhancing the effectiveness of adversarial examples of faces in the physical world. For example, an initial face image of adversarial tester A can be acquired to adjust the interference curve generator. Then, adversarial tester A wears the generated 3D interference curve mask, and face images from at least one angle are captured by an image acquisition device such as a camera as adversarial examples of faces in the physical world. Similarly, when there are multiple adversarial testers, an initial face image can be acquired for each adversarial tester to generate a 3D interference curve mask corresponding to each adversarial tester, thereby acquiring adversarial examples of faces in the physical world corresponding to each adversarial tester.
[0100] In some implementations, a first loss value can be calculated for each intermediate layer in the face recognition model, and the interference curve generator can be iteratively updated based on the first loss value for each intermediate layer until the loss value is lower than a preset threshold, resulting in an updated interference curve generator for each intermediate layer. The updated interference curve generator for each intermediate layer generates a target curve for that intermediate layer, and the target curve for each intermediate layer yields a physical-world adversarial face sample. The interference effects of all physical-world adversarial face samples for all intermediate layers are compared, and the one with the best interference effect is selected as the target physical-world adversarial face sample. Since different intermediate layers may contribute differently to the output of the face recognition model, training a corresponding updated interference curve generator for each intermediate layer and selecting the physical-world adversarial face sample obtained from the one with the best effect can improve the attack effect of adversarial face samples. This enhances the face recognition system's ability to prevent physical-world adversarial attacks when using adversarial face samples to launch adversarial attacks, thus ensuring the security of the face recognition system. For example, a face recognition model may include intermediate layers 1 to 3. The model can input the first encoded vector of a reference face image and the second encoded vector of an adversarial face image to obtain the first feature representation 1 and the second feature representation 1 output by intermediate layer 1, the first feature representation 2 and the second feature representation 2 output by intermediate layer 2, and the first feature representation 3 and the second feature representation 3 output by intermediate layer 3. 
For any intermediate layer, such as intermediate layer 1, a first loss value 1 can be calculated based on the first encoded vector, the second encoded vector, the first feature representation 1, and the second feature representation 1. The interference curve generator is iteratively updated based on the first loss value 1 until the loss value is lower than a preset threshold, resulting in the updated interference curve generator 1 corresponding to intermediate layer 1. This process is repeated to obtain the updated interference curve generator 2 corresponding to intermediate layer 2 and the updated interference curve generator 3 corresponding to intermediate layer 3. Interference curves 1, 2, and 3 are generated using updated interference curve generators 1, 2, and 3, respectively. These interference curves then generate physical-world adversarial sample 1, 2, and 3, respectively. Adversarial attacks are then performed on the face recognition system using these three samples, and the attack effectiveness is compared. The sample with the best attack performance, such as physical-world adversarial sample 1, is used to attack the face recognition system.
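The per-layer procedure can be summarised by the sketch below. The three callables are hypothetical stand-ins for training a generator against one intermediate layer, producing a physical-world sample from it, and measuring the attack effect; the scores are toy numbers, not experimental results.

```python
def select_best_sample(layers, train_generator, make_sample, attack_score):
    """Train one generator per intermediate layer and keep the best sample."""
    best = None
    for layer in layers:
        generator = train_generator(layer)  # updated generator for this layer
        sample = make_sample(generator)     # physical-world adversarial sample
        score = attack_score(sample)        # higher means a stronger attack
        if best is None or score > best[2]:
            best = (layer, sample, score)
    return best

# Toy stand-ins so the sketch runs end to end.
toy_scores = {1: 0.91, 2: 0.74, 3: 0.66}
best_layer, best_sample, best_score = select_best_sample(
    layers=[1, 2, 3],
    train_generator=lambda layer: f"generator_{layer}",
    make_sample=lambda gen: gen.replace("generator", "sample"),
    attack_score=lambda sample: toy_scores[int(sample.split("_")[1])],
)
```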
[0101] The embodiments of this application can be implemented by a face adversarial example generation system, which includes an interference curve generator, a stealth discriminator, a face recognition model, an optimizer, 3D modeling software, and 3D printing equipment. Figure 6 shows the workflow of the face adversarial example generation system. The system updates the parameters of the interference curve generator during the training phase and then uses the updated generator in the generation phase to generate physical-world face adversarial examples. Specifically, during the training phase, the interference curve generator generates an initial interference curve, which is used to add interference to the initial face image, resulting in an adversarial face image. A second loss value for the adversarial face image is calculated using the stealth discriminator, and a first loss value is calculated using the face recognition model to compare the reference face image and the adversarial face image. The first and second loss values are summed to obtain the final loss value. Based on this loss value, the optimizer updates the interference curve generator, resulting in an updated generator. In the generation phase, the updated generator generates the target interference curve. A three-dimensional interference curve model of the target interference curve is constructed using 3D modeling software and printed using 3D printing equipment, resulting in a linear three-dimensional interference curve mask. The generated three-dimensional interference curve mask is worn by adversarial testers, and face images from at least one angle are captured using image acquisition equipment as physical-world face adversarial examples.
[0102] Therefore, the face adversarial example generation system provided in this application can improve the robustness of face recognition systems by generating highly aggressive physical world face adversarial examples. Typically, face recognition systems can only recognize faces under normal conditions and cannot cope with complex physical attacks, such as those involving masks or glasses. Physical world adversarial attacks can be used to evaluate the robustness of face recognition systems, helping to improve system algorithms and increase recognition accuracy. Face recognition systems are widely used in industries such as security, finance, and logistics, involving a large amount of sensitive information and personal privacy. Physical world adversarial attacks can help enterprises and organizations assess the security and confidentiality of face recognition systems, reducing the risk of attacks and losses.
[0103] In this embodiment, an interference curve generator, an initial face image, and a reference face image are obtained; an initial interference curve is generated by the interference curve generator; interference is added to the initial face image using the initial interference curve to obtain an adversarial face image; the interference curve generator is updated based on the loss values of the reference face image and the adversarial face image to obtain an updated interference curve generator; a target interference curve is generated by the updated interference curve generator to obtain physical world face adversarial samples from the target interference curve.
[0104] Therefore, this application embodiment obtains a reference face image as the face image to be attacked. An initial interference curve generated by an interference curve generator is used to add interference to the initial face image, making the generated adversarial face image more similar to the potentially attacked image in the face recognition system, thus possessing higher concealment. Then, by calculating the loss value between the interfered adversarial face image and the reference face image, the loss value is backpropagated to the interference curve generator, gradually updating the parameters of the interference curve generator. This causes the interference curve to grow in the direction of decreasing loss function, enabling the updated interference curve generator to generate an interference curve that better meets expectations. This further increases the concealment of the target interference curve superimposed on the face by the updated interference curve generator, thereby improving the attack effect of the generated physical world face adversarial sample. This physical world face adversarial sample can be used to launch adversarial attacks against the face recognition system, detect vulnerabilities in the face recognition system, and promptly fix these vulnerabilities, thereby improving the face recognition system's ability to prevent physical world adversarial attacks and ensuring the security of the face recognition system.
[0105] The above describes a method for generating adversarial examples of faces in the embodiments of this application. The following describes the apparatus (e.g., a server) for generating adversarial examples of faces that performs the above method.
[0106] Refer to Figure 7, which shows a structural schematic of a face adversarial example generation device. This device can be applied to servers in face recognition scenarios requiring defense against adversarial attacks. It acquires an interference curve generator, an initial face image, and a reference face image. The interference curve generator generates an initial interference curve, which is used to add interference to the initial face image, resulting in an adversarial face image. Based on the loss value between the reference face image and the adversarial face image, the interference curve generator is updated, resulting in an updated interference curve generator. The updated interference curve generator then generates a target interference curve, from which physical-world face adversarial examples are obtained. The face adversarial example generation device in this embodiment can implement the steps of the face adversarial example generation method executed in the embodiment corresponding to Figure 2 above. The functions implemented by the face adversarial example generation device can be implemented by hardware or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions, and the modules can be software and / or hardware. The face adversarial example generation device may include an input / output module 601 and a processing module 602. For the functional implementation of the processing module 602 and the input / output module 601, refer to the operations performed in the embodiment corresponding to Figure 2, which will not be described in detail here. For example, the processing module 602 can be used to control the sending, receiving, and acquiring operations of the input / output module 601.
[0107] The input / output module 601 is configured to acquire an interference curve generator, an initial face image, and a reference face image;
[0108] The processing module 602 is configured to generate an initial interference curve using an interference curve generator; add interference to an initial face image using the initial interference curve to obtain an adversarial face image; update the interference curve generator based on the loss values of the reference face image and the adversarial face image to obtain an updated interference curve generator; and generate a target interference curve using the updated interference curve generator to obtain a physical world face adversarial sample from the target interference curve.
[0109] In some implementations, the processing module 602 may specifically be used for:
[0110] An interference image is generated from the initial interference curve;
[0111] The first image pixel of the interference image is fused with the second image pixel of the initial face image pixel by pixel to obtain the adversarial face image.
[0112] In some implementations, the loss value includes a first loss value, and the processing module 602 can also be used for:
[0113] Obtain the first encoding vector of the reference face image, the first feature representation of the reference face image, the second encoding vector of the adversarial face image, and the second feature representation of the adversarial face image;
[0114] A cross calculation (for example, an inner product) is performed on the first encoding vector and the second encoding vector to obtain the encoding cross value;
[0115] The difference between the first feature representation and the second feature representation is calculated to obtain the feature representation difference value;
[0116] The first loss value is calculated from the encoding cross value and the feature representation difference value.
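As a minimal sketch of the first loss value, the calculation can be written following the form given in the claims: an inner product of the two encoding vectors, a squared difference of the two feature representations, and a weighted summation of the two. The weights w_cross and w_feat, their default values, and the function name first_loss are assumptions for illustration; the specification does not give the weights or their signs.

```python
import numpy as np

def first_loss(enc_ref, enc_adv, feat_ref, feat_adv, w_cross=1.0, w_feat=1.0):
    """Weighted sum of the encoding cross value and the feature difference value.

    w_cross and w_feat are hypothetical weights; either may be negative in practice.
    """
    # Encoding cross value: inner product of the two encoding vectors.
    cross = float(np.dot(enc_ref, enc_adv))
    # Feature representation difference value: sum of squared differences.
    diff = float(np.sum((np.asarray(feat_ref) - np.asarray(feat_adv)) ** 2))
    return w_cross * cross + w_feat * diff

# Toy vectors standing in for model outputs and intermediate-layer features.
enc_ref = np.array([1.0, 0.0])
enc_adv = np.array([0.6, 0.8])
feat_ref = np.array([1.0, 2.0, 3.0])
feat_adv = np.array([1.0, 2.0, 5.0])
loss = first_loss(enc_ref, enc_adv, feat_ref, feat_adv)
```

With the toy vectors above, the encoding cross value is 0.6 and the feature representation difference value is 4.0, so the unit-weighted sum is 4.6.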
[0117] In some implementations, the loss value also includes a second loss value, and the processing module 602 can also be used for:
[0118] Obtain a pre-trained covert discriminator, which is trained using covert samples, the covert samples including training adversarial face images and the covert scores of those training adversarial face images;
[0119] Use the pre-trained covert discriminator to perform covert prediction on the adversarial face image to obtain the covert score of the adversarial face image;
[0120] The covert score of the adversarial face image is used as the second loss value.
[0121] In some implementations, the interference curve includes a Bézier curve, and the processing module 602 can also be used for:
[0122] A three-dimensional interference curve model is generated from the target interference curve;
[0123] Generate a 3D interference curve mask corresponding to the 3D interference curve model;
[0124] A face image of a person wearing the 3D interference curve mask is collected as the physical-world face adversarial sample.
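For illustration, a Bézier curve of the kind mentioned above can be evaluated from its control points with the standard Bernstein polynomial form and then drawn into an image of a chosen size. This is a minimal two-dimensional sketch: the function names bezier_points and rasterize, the sample count, and the control points are hypothetical, and the sketch does not cover the three-dimensional model or the mask.

```python
import numpy as np
from math import comb

def bezier_points(control_points, num_samples=50):
    """Sample a Bezier curve: B(t) = sum_i C(n, i) * t^i * (1 - t)^(n - i) * P_i."""
    ctrl = np.asarray(control_points, dtype=np.float64)
    n = len(ctrl) - 1
    t = np.linspace(0.0, 1.0, num_samples)
    # One column of Bernstein basis values per control point.
    basis = np.stack(
        [comb(n, i) * t**i * (1.0 - t) ** (n - i) for i in range(n + 1)], axis=1
    )
    return basis @ ctrl  # shape: (num_samples, 2), columns are (x, y)

def rasterize(points, height, width):
    """Mark the sampled curve points in a single-channel image of the given size."""
    image = np.zeros((height, width), dtype=np.uint8)
    cols = np.clip(np.rint(points[:, 0]).astype(int), 0, width - 1)
    rows = np.clip(np.rint(points[:, 1]).astype(int), 0, height - 1)
    image[rows, cols] = 255
    return image

# Hypothetical quadratic curve defined by three control points.
ctrl = [(0, 0), (4, 8), (8, 0)]
pts = bezier_points(ctrl, num_samples=5)
curve_image = rasterize(pts, height=9, width=9)
```

The curve starts at the first control point, ends at the last, and is only pulled toward the middle one, which is why a small number of control points is enough to parameterize a smooth interference curve.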
[0125] In this embodiment, the processing module 602 uses the reference face image as the target face image. The initial interference curve generated by the interference curve generator is used to add interference to the initial face image, so that the resulting adversarial face image more closely resembles the image that may be attacked in the face recognition system, which enhances its concealment. The loss value between the adversarial face image and the reference face image is then calculated and backpropagated to the interference curve generator. The parameters of the interference curve generator are updated step by step, so that the interference curve evolves in the direction that decreases the loss function. The updated interference curve generator can therefore generate an interference curve that better meets expectations, which further increases the concealment of the target interference curve superimposed on the face and improves the attack effect of the generated physical world face adversarial sample. This physical world face adversarial sample can be used to launch adversarial attacks against the face recognition system in order to detect its vulnerabilities and fix them promptly, thereby improving the face recognition system's ability to defend against physical world adversarial attacks and ensuring the security of the face recognition system.
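The iterative update described above can be sketched as plain gradient descent on the generator parameters. Several simplifications are assumed here: the generator parameters are taken to be the control points themselves, backpropagation is replaced by a central-difference numerical gradient, and the loss is a toy quadratic standing in for the first and second loss values. The names numerical_gradient, update_generator, and toy_loss, the learning rate, and the step count are all hypothetical.

```python
import numpy as np

def numerical_gradient(loss_fn, params, eps=1e-4):
    """Central-difference gradient; a stand-in for backpropagation."""
    grad = np.zeros_like(params)
    for idx in np.ndindex(params.shape):
        step = np.zeros_like(params)
        step[idx] = eps
        grad[idx] = (loss_fn(params + step) - loss_fn(params - step)) / (2 * eps)
    return grad

def update_generator(params, loss_fn, lr=0.1, steps=100):
    """Move the parameters step by step in the direction that decreases the loss."""
    params = params.astype(np.float64).copy()
    history = [loss_fn(params)]
    for _ in range(steps):
        params -= lr * numerical_gradient(loss_fn, params)
        history.append(loss_fn(params))
    return params, history

# Toy loss (assumption): squared distance to a fixed set of control points.
target = np.array([[0.0, 0.0], [4.0, 8.0], [8.0, 0.0]])

def toy_loss(p):
    return float(np.sum((p - target) ** 2))

init = np.zeros((3, 2))
final, history = update_generator(init, toy_loss)
```

In a real setting the loss would be computed from the adversarial face image and the reference face image, and an automatic differentiation framework would supply the gradient; the loop structure stays the same.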
[0126] The face adversarial example generation device 60 in this application embodiment has been described above from the perspective of modular functional entities. The face adversarial example generation device in this application embodiment will be described below from the perspective of hardware processing.
[0127] This application also provides a computing device, as shown in Figure 8, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the above-described method for generating adversarial examples of faces.
[0128] It should be noted that the physical device corresponding to the input / output module 601 shown in Figure 7 can be a transceiver, a radio frequency circuit, a communication module, an input / output (I / O) interface, etc., and the physical device corresponding to the processing module 602 can be a processor.
[0129] The devices shown in Figure 7 can all have the structure shown in Figure 8. When the face adversarial example generation device 60 shown in Figure 7 has the structure shown in Figure 8, the processor and transceiver in Figure 8 can perform the same or similar functions as the processing module 602 and input / output module 601 provided in the aforementioned device embodiments, and the memory in Figure 8 stores the computer programs that the processor needs to call when executing the above-mentioned method for generating adversarial examples of faces.
[0130] This application also provides a terminal device, as shown in Figure 9. For ease of explanation, only the parts related to the embodiments of this application are shown. For specific technical details not disclosed, please refer to the method section of the embodiments of this application. The terminal device can be any terminal device including mobile phones, tablets, personal digital assistants (PDAs), point-of-sale (POS) terminals, in-vehicle computers, etc. Taking a mobile phone as an example:
[0131] Figure 9 is a block diagram of part of the structure of a mobile phone related to the terminal device provided in this embodiment. Referring to Figure 9, the mobile phone includes components such as a radio frequency (RF) circuit 1010, a memory 1020, an input unit 1030, a display unit 1040, a sensor 1050, an audio circuit 1060, a wireless fidelity (WiFi) module 1070, a processor 1080, and a power supply 1090. Those skilled in the art will understand that the mobile phone structure shown in Figure 9 does not constitute a limitation on the mobile phone, which may include more or fewer components than shown, combine certain components, or have different component arrangements.
[0132] Each component of the mobile phone is described in detail below with reference to Figure 9:
[0133] The RF circuit 1010 can be used for receiving and transmitting signals during information transmission or calls. Specifically, it receives downlink information from the base station and passes it to the processor 1080 for processing; additionally, it transmits uplink data to the base station. Typically, the RF circuit 1010 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier (LNA), a duplexer, etc. Furthermore, the RF circuit 1010 can also communicate wirelessly with networks and other devices. The aforementioned wireless communication can use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, and Short Messaging Service (SMS).
[0134] The memory 1020 can be used to store software programs and modules. The processor 1080 executes various mobile phone functions and data processing by running the software programs and modules stored in the memory 1020. The memory 1020 may mainly include a program storage area and a data storage area. The program storage area may store the operating system and the applications required for at least one function (such as a sound playback function, an image playback function, etc.); the data storage area may store data created according to the use of the mobile phone (such as audio data, phonebook, etc.). In addition, the memory 1020 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other non-volatile solid-state storage device.
[0135] The input unit 1030 can be used to receive input numerical or character information, and to generate key signal inputs related to user settings and function control of the mobile phone. Specifically, the input unit 1030 may include a touch panel 1031 and other input devices 1032. The touch panel 1031, also known as a touch screen, can collect touch operations performed by the user on or near it (such as operations performed by the user using a finger, stylus, or any suitable object or accessory on or near the touch panel 1031), and drive the corresponding connection devices according to a pre-set program. Optionally, the touch panel 1031 may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch position and the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, and sends it to the processor 1080, and can also receive and execute commands sent by the processor 1080. In addition, the touch panel 1031 can be implemented using various types such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 1031, the input unit 1030 may also include other input devices 1032. Specifically, other input devices 1032 may include, but are not limited to, one or more of the following: physical keyboard, function keys (such as volume control buttons, power buttons, etc.), trackball, mouse, joystick, etc.
[0136] The display unit 1040 can be used to display information input by the user or information provided to the user, as well as various menus of the mobile phone. The display unit 1040 may include a display panel 1041, which may optionally be configured as a liquid crystal display (LCD), organic light-emitting diode (OLED), or similar display. Further, a touch panel 1031 may cover the display panel 1041. When the touch panel 1031 detects a touch operation on or near it, it transmits the information to the processor 1080 to determine the type of touch event. Subsequently, the processor 1080 provides corresponding visual output on the display panel 1041 based on the type of touch event. Although in Figure 9 the touch panel 1031 and the display panel 1041 are shown as two separate components that implement the input and output functions of the mobile phone, in some embodiments the touch panel 1031 and the display panel 1041 can be integrated to implement the input and output functions of the mobile phone.
[0137] The mobile phone may also include at least one sensor 1050, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 1041 according to the ambient light level, and the proximity sensor can turn off the display panel 1041 and / or the backlight when the phone is moved to the ear. As a type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in various directions (generally three axes). When stationary, it can detect the magnitude and direction of gravity and can be used for applications that recognize the phone's posture (such as landscape / portrait switching, related games, magnetometer posture calibration), vibration recognition-related functions (such as pedometer, taps), etc. Other sensors that may be configured in the mobile phone, such as gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, will not be described in detail here.
[0138] The audio circuit 1060, speaker 1061, and microphone 1062 provide an audio interface between the user and the mobile phone. The audio circuit 1060 converts the received audio data into electrical signals and transmits them to the speaker 1061, where the speaker 1061 converts them into sound signals for output. On the other hand, the microphone 1062 converts the collected sound signals into electrical signals, which are then received by the audio circuit 1060, converted into audio data, and then processed by the processor 1080 before being transmitted via the RF circuit 1010 to, for example, another mobile phone, or the audio data can be output to the memory 1020 for further processing.
[0139] Wi-Fi is a short-range wireless transmission technology. Through the Wi-Fi module 1070, mobile phones can help users send and receive emails, browse web pages, and access streaming media, providing users with wireless broadband internet access. Although Figure 9 shows the Wi-Fi module 1070, it is understood that it is not an essential component of a mobile phone and can be omitted as needed without changing the essence of the invention.
[0140] The processor 1080 is the control center of the mobile phone, connecting various parts of the phone through various interfaces and lines. It executes software programs and / or modules stored in the memory 1020 and calls data stored in the memory 1020 to perform various functions and process data, thereby providing overall monitoring of the phone. Optionally, the processor 1080 may include one or more processing units; optionally, the processor 1080 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, and applications, and the modem processor mainly handles wireless communication. It is understood that the aforementioned modem processor may also not be integrated into the processor 1080.
[0141] The mobile phone also includes a power supply 1090 (such as a battery) that supplies power to various components. Optionally, the power supply can be logically connected to the processor 1080 through a power management system, thereby enabling functions such as charging, discharging, and power consumption management through the power management system.
[0142] Although not shown, mobile phones may also include a camera, Bluetooth module, etc., which will not be described in detail here.
[0143] In this embodiment of the application, the processor 1080 included in the mobile phone also has the function of controlling the execution of the face adversarial sample generation method performed by the face adversarial sample generation device.
[0144] This application also provides a server. Referring to Figure 10, Figure 10 is a schematic diagram of a server structure provided in an embodiment of this application. The server 1100 can vary significantly due to different configurations or performance. It may include one or more microprocessors (central processing units, CPUs) 1122 (e.g., one or more processors) and memory 1132, and one or more storage media 1130 (e.g., one or more mass storage devices) for storing application programs 1142 or data 1144. The memory 1132 and storage media 1130 can be temporary or persistent storage. The program stored in the storage media 1130 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Furthermore, the microprocessor 1122 may be configured to communicate with the storage media 1130 and execute the series of instruction operations in the storage media 1130 on the server 1100.
[0145] Server 1100 may also include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input / output interfaces 1158, and / or one or more operating systems 1141, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc.
[0146] The steps performed by the server in the above embodiments can be based on the structure of the server 1100 shown in Figure 10. For example, the steps performed by the face adversarial example generation device shown in Figure 7 in the above embodiment can be based on the server structure shown in Figure 10. For example, the microprocessor 1122 performs the following operations by calling instructions in the memory 1132:
[0147] The interference curve generator, the initial face image, and the reference face image are obtained through the input / output interface 1158.
[0148] The initial interference curve can be generated by the interference curve generator through the input / output interface 1158; the initial interference curve is used to add interference to the initial face image to obtain an adversarial face image; the interference curve generator is updated according to the loss value of the reference face image and the adversarial face image to obtain the updated interference curve generator; the updated interference curve generator is used to generate the target interference curve to obtain the physical world face adversarial sample.
[0149] In the above embodiments, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions in other embodiments.
[0150] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and modules described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
[0151] In the embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of modules is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between apparatuses or modules through some interfaces, and may be electrical, mechanical, or in other forms.
[0152] The modules described as separate components may or may not be physically separate. Similarly, the components shown as modules may or may not be physical modules; they may be located in one place or distributed across multiple network modules. Some or all of the modules can be selected to achieve the purpose of this embodiment, depending on actual needs.
[0153] This application also provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to perform the above-described method for generating adversarial examples of faces.
[0154] Furthermore, the functional modules in the various embodiments of this application can be integrated into one processing module, or each module can exist physically separately, or two or more modules can be integrated into one module. The integrated module can be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it can be stored in a computer-readable storage medium.
[0155] This application also provides a chip, which includes a processor coupled to the transceiver of a terminal device, for executing the technical solutions provided in this application.
[0156] This application also provides a chip system, which includes a processor for supporting a terminal device in implementing the functions involved in the above-described method for generating adversarial examples of faces, such as generating or processing the information involved in the above-described method for generating adversarial examples of faces.
[0157] This application also provides a computer program product containing instructions, which, when executed on a computer or processor, cause the computer to perform the above-described method for generating adversarial examples of faces.
[0158] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product.
[0159] A computer program product includes one or more computer instructions. When the computer program is loaded and executed on a computer, it produces, in whole or in part, the flow or function according to the embodiments of this application. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, computer instructions may be transmitted from one website, computer, server, or data center to another via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid-state drive (SSD)).
[0160] The technical solutions provided in the embodiments of this application have been described in detail above. Specific examples have been used in the embodiments of this application to illustrate the principles and implementation methods of the embodiments of this application. The description of the above embodiments is only for the purpose of helping to understand the methods and core ideas of the embodiments of this application. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of the embodiments of this application. Therefore, the content of this specification should not be construed as a limitation on the embodiments of this application.
Claims
1. A method for generating a face adversarial sample, characterized in that the method includes:
obtaining an interference curve generator, an initial face image, and a reference face image;
generating an initial interference curve by the interference curve generator, wherein the initial interference curve includes an initial Bézier curve;
matching the control points of the initial Bézier curve with the pixels of the initial face image, or converting the initial interference curve into an interference image of the same size as the initial face image, and fusing the first image pixels of the interference image with the second image pixels of the initial face image pixel by pixel, to obtain an adversarial face image;
updating the interference curve generator based on the loss values of the reference face image and the adversarial face image to obtain an updated interference curve generator, wherein the loss values include a first loss value and a second loss value;
generating a target interference curve by the updated interference curve generator, and superimposing the target interference curve on the initial face image, or acquiring a face image wearing a 3D interference curve mask, to obtain a physical world face adversarial sample of the initial face image, wherein the 3D interference curve mask is generated based on the 3D interference curve model corresponding to the target interference curve;
wherein the step of determining the first loss value includes: obtaining a first encoding vector of the reference face image, a first feature representation of the reference face image, a second encoding vector of the adversarial face image, and a second feature representation of the adversarial face image, wherein the first feature representation includes a vector representation obtained by any intermediate layer of the face recognition model performing abstract feature extraction on the reference face image, and the second feature representation includes a vector representation obtained by any intermediate layer of the face recognition model performing abstract feature extraction on the adversarial face image, wherein the intermediate layer refers to a hidden layer between the input layer and the output layer of the face recognition model; calculating the inner product of the first encoding vector and the second encoding vector to obtain an encoding cross value; calculating the squared difference of the first feature representation and the second feature representation to obtain a feature representation difference value; and performing a weighted summation of the encoding cross value and the feature representation difference value to obtain the first loss value;
and the step of determining the second loss value includes: obtaining a pre-trained covert discriminator, the pre-trained covert discriminator being trained from covert samples, the covert samples including training adversarial face images and covert scores of the training adversarial face images; performing covert prediction on the adversarial face image using the pre-trained covert discriminator to obtain the covert score of the adversarial face image; and using the covert score of the adversarial face image as the second loss value.
2. An apparatus for generating a face adversarial sample, the apparatus comprising:
an input / output module configured to acquire an interference curve generator, an initial face image, and a reference face image;
a processing module configured to: generate an initial interference curve using the interference curve generator, wherein the initial interference curve includes an initial Bézier curve; match the control points of the initial Bézier curve with the pixels of the initial face image, or convert the initial interference curve into an interference image of the same size as the initial face image, and perform pixel-by-pixel fusion of the first image pixels of the interference image and the second image pixels of the initial face image to obtain an adversarial face image; update the interference curve generator according to the loss value between the reference face image and the adversarial face image to obtain an updated interference curve generator, wherein the loss value includes a first loss value and a second loss value; generate a target interference curve using the updated interference curve generator, superimpose the target interference curve onto the initial face image, or acquire a face image wearing a 3D interference curve mask to obtain a physical world adversarial face sample of the initial face image, wherein the 3D interference curve mask is generated based on a 3D interference curve model corresponding to the target interference curve;
wherein the step of determining the first loss value includes: obtaining a first encoding vector of the reference face image, a first feature representation of the reference face image, a second encoding vector of the adversarial face image, and a second feature representation of the adversarial face image, wherein the first feature representation includes a vector representation obtained by any intermediate layer of the face recognition model performing abstract feature extraction on the reference face image, and the second feature representation includes a vector representation obtained by any intermediate layer of the face recognition model performing abstract feature extraction on the adversarial face image, wherein the intermediate layer refers to a hidden layer between the input layer and the output layer of the face recognition model; calculating the inner product of the first encoding vector and the second encoding vector to obtain an encoding cross value; calculating the squared difference of the first feature representation and the second feature representation to obtain a feature representation difference value; and performing a weighted summation of the encoding cross value and the feature representation difference value to obtain the first loss value;
and the step of determining the second loss value includes: obtaining a pre-trained covert discriminator, the pre-trained covert discriminator being trained from covert samples, the covert samples including training adversarial face images and covert scores of the training adversarial face images; performing covert prediction on the adversarial face image using the pre-trained covert discriminator to obtain the covert score of the adversarial face image; and using the covert score of the adversarial face image as the second loss value.
3. A computing device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the method as described in claim 1.
4. A computer-readable storage medium, characterized in that it includes instructions that, when run on a computer, cause the computer to perform the method as described in claim 1.
5. A computer program product comprising instructions, the computer program product including program instructions that, when executed on a computer or processor, cause the computer or processor to perform the method as claimed in claim 1.
6. A chip system, characterized in that the chip system includes: a communication interface used for inputting and / or outputting information; and a processor for executing a computer-executable program, causing a device having the chip system installed to perform the method as described in claim 1.
Citation Information
Patent Citations
Image target detection model attack method and device, terminal equipment and storage medium
CN112215227A
Face recognition attack sample generation method, model training method and related equipment
CN114241569A