Method and apparatus for authenticating an image

By using multiple target local feature recognition models in a deep learning network to identify local image features and calculate similarity, the problem of poor defense and low detection performance of existing adversarial defense methods is solved, achieving effective defense against adversarial examples and accurate detection of noise-free samples.

CN115775401B · Active · Publication Date: 2026-06-16 · JD DIGITS HAIYI INFORMATION TECHNOLOGY CO LTD

Cites: 2 · Cited by: 0

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: JD DIGITS HAIYI INFORMATION TECHNOLOGY CO LTD
Filing Date: 2021-09-06
Publication Date: 2026-06-16

Patent Timeline

06 Sep 2021: Application
16 Jun 2026: Publication (CN115775401B)

IPC: G06V40/16; G06V10/82; G06N3/0464; G06N3/08

CPC: G06V40/16; G06V10/82; G06N3/08; G06N3/0464

AI Tagging

Application Domain

Character and pattern recognition; Neural learning methods

Technical Efficacy Phrases

Improve defense; Avoid the problem of poor detection performance

AI Technical Summary

⚠Technical Problem

Existing adversarial defense methods suffer from poor defense performance and poor detection performance for noise-free samples in deep learning networks, especially in adversarial sample detection and data preprocessing.

⚗Method used

Multiple target local feature recognition models identify local region features of the image to be verified and of the reference image. The feature similarity is calculated for each model, and whether the image passes verification is determined against a similarity threshold, which avoids both preprocessing of sample data and adversarial training.

🎯Benefits of technology

It improves the system's defense performance against adversarial examples, avoids the decline in detection performance for noise-free samples, and does not rely on adversarial training, thus improving the accuracy and efficiency of image verification.

✦ Generated by Eureka AI based on patent content.

Abstract

The application discloses a method and device for verifying images, and relates to the technical field of computers. The method comprises the following steps: acquiring a to-be-verified image, and identifying first local features of a plurality of local regions of the to-be-verified image by using a plurality of target local feature recognition models; acquiring a reference image, and identifying second local features of a plurality of local regions of the reference image by using the plurality of target local feature recognition models; for each of the plurality of target local feature recognition models, acquiring a feature similarity between the first local features identified by using the target local feature recognition model and the second local features identified by using the target local feature recognition model; and determining whether the to-be-verified image passes the verification according to the plurality of acquired feature similarities. The method can improve the defense performance of the system against adversarial samples and improve the accuracy of image verification.

Description

Technical Field

[0001] This disclosure relates to the field of computer technology, and more particularly to methods and apparatus for verifying images.

Background Technology

[0002] With the rapid development of artificial intelligence technology, deep learning networks are widely used in image processing (such as image recognition and image conversion). When performing image processing tasks based on deep learning networks, adversarial defense methods are usually employed to prevent adversarial attacks on the deep learning network. Existing adversarial defense methods include: performing adversarial example detection on the input samples to the network, adversarially training the deep learning network, or preprocessing the samples.

[0003] However, defense methods based on adversarial example detection offer poor defense capability, while adversarial training and sample data preprocessing can degrade the detection performance of deep learning networks on noise-free samples.

Summary of the Invention

[0004] This disclosure provides a method, apparatus, electronic device, and computer-readable storage medium for verifying images.

[0005] According to a first aspect of this disclosure, a method for verifying an image is provided, comprising: acquiring an image to be verified, and using multiple target local feature recognition models to identify first local features of multiple local regions of the image to be verified; acquiring a reference image, and using multiple target local feature recognition models to identify second local features of multiple local regions of the reference image; for each of the multiple target local feature recognition models, acquiring a feature similarity between the first local feature identified by the target local feature recognition model and the second local feature identified by the target local feature recognition model; and determining whether the image to be verified passes verification based on the acquired multiple feature similarities.

[0006] In some embodiments, acquiring an image to be verified and using multiple target local feature recognition models to identify first local features of multiple local regions of the image to be verified includes: acquiring a face image to be verified and dividing the face image to be verified into different face regions; for each face region in the face image to be verified, using a target local feature recognition model for recognizing features of face regions to identify the first local feature of that face region; acquiring a reference image and using multiple target local feature recognition models to identify second local features of multiple local regions of the reference image includes: acquiring a reference face image and dividing the reference face image into different face regions; for each face region in the reference face image, using a target local feature recognition model for recognizing features of face regions to identify the second local feature of that face region.

[0007] In some embodiments, determining whether an image to be verified passes verification based on multiple acquired feature similarities includes: in response to determining that at least one feature similarity among the multiple feature similarities satisfies a first similarity threshold, determining that the image to be verified passes verification.

[0008] In some embodiments, determining whether an image to be verified passes verification based on multiple acquired feature similarities includes: determining that the image to be verified passes verification in response to determining that each of the multiple feature similarities satisfies a second similarity threshold.

[0009] According to a second aspect of this disclosure, a method for verifying an image is provided, comprising: acquiring an image to be verified and identifying a first global feature of the image to be verified using a trained feature recognition model; acquiring a reference image and identifying a second global feature of the reference image using a trained feature recognition model; verifying the image to be verified using the method in the first aspect in response to determining that the similarity between the first global feature and the second global feature meets a third similarity threshold; or, determining that the image to be verified fails verification in response to determining that the similarity between the first global feature and the second global feature does not meet the third similarity threshold.
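The second aspect can be read as a two-stage cascade: a global similarity check gates the more expensive local check. The following is a minimal sketch under stated assumptions: "meets the threshold" is taken to mean greater than or equal to, and `verify_two_stage` and `local_verify` are hypothetical names standing in for the global comparison result and the first-aspect method.

```python
from typing import Callable


def verify_two_stage(
    global_similarity: float,
    third_threshold: float,
    local_verify: Callable[[], bool],
) -> bool:
    """Run the local-feature check only if the global similarity meets the threshold."""
    # Assumption: "meets" means >=; the source does not fix the comparison.
    if global_similarity < third_threshold:
        return False  # fails verification without running the local models
    # Global check passed: defer the final decision to the first-aspect method.
    return local_verify()
```

Passing the local check as a callable keeps it lazy, so the local models are never evaluated for an image that already fails the global comparison.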

[0010] According to a third aspect of this disclosure, a method for training a model is provided, comprising: acquiring at least one piece of sample data, the sample data including a sample image and labels of local images of various local regions in the sample image; acquiring initial local feature recognition models for recognizing features of various local regions; for each local image in the local images of various regions, inputting the local image into the initial local feature recognition model for recognizing features of the local region to which the local image belongs, and obtaining local features output by the initial local feature recognition model; acquiring a loss between the label of the local image and the label represented by the local features; training multiple initial local feature recognition models based on the average of the acquired multiple losses, and obtaining multiple target local feature recognition models, wherein the target local feature recognition models are applied to the method of the first aspect or the second aspect.
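The training objective in the third aspect averages one loss per local region model. The source does not name the loss function; the sketch below assumes softmax cross-entropy over class scores produced by each region model, and the names `cross_entropy`, `average_region_loss`, `region_logits`, and `region_labels` are hypothetical.

```python
import math
from typing import Dict, Sequence


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy for one local image (numerically stable form)."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def average_region_loss(
    region_logits: Dict[str, Sequence[float]],
    region_labels: Dict[str, int],
) -> float:
    """Average the per-region losses; this single value drives the joint update."""
    if not region_logits:
        raise ValueError("at least one region is required")
    losses = [
        cross_entropy(region_logits[name], region_labels[name])
        for name in region_logits
    ]
    return sum(losses) / len(losses)
```

In an actual training loop the averaged value would be backpropagated through all region models with an automatic-differentiation framework; that part is omitted here.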

[0011] According to a fourth aspect of this disclosure, an apparatus for verifying an image is provided, comprising: a first recognition unit configured to acquire an image to be verified and to recognize first local features of multiple local regions of the image to be verified using multiple target local feature recognition models; a second recognition unit configured to acquire a reference image and to recognize second local features of multiple local regions of the reference image using multiple target local feature recognition models; a matching unit configured to, for each of the multiple target local feature recognition models, acquire feature similarity between the first local feature recognized by the target local feature recognition model and the second local feature recognized by the target local feature recognition model; and a verification unit configured to determine whether the image to be verified passes verification based on the acquired multiple feature similarities.

[0012] In some embodiments, the first recognition unit includes: a first segmentation module configured to acquire a face image to be verified and segment the face image to be verified into different face regions; and a first recognition module configured to, for each face region in the face image to be verified, use a target local feature recognition model for recognizing features of the face region to recognize a first local feature of the face region; the second recognition unit includes: a second segmentation module configured to acquire a reference face image and segment the reference face image into different face regions; and a second recognition module configured to, for each face region in the reference face image, use a target local feature recognition model for recognizing features of the face region to recognize a second local feature of the face region.

[0013] In some embodiments, the verification unit includes: a first verification module configured to determine that the image to be verified passes verification in response to determining that at least one feature similarity among a plurality of feature similarities satisfies a first similarity threshold.

[0014] In some embodiments, the verification unit includes a second verification module configured to determine that the image to be verified passes verification in response to determining that each of a plurality of feature similarities satisfies a second similarity threshold.

[0015] According to a fifth aspect of this disclosure, an apparatus for verifying an image is provided, comprising: a third recognition unit configured to acquire an image to be verified and to recognize a first global feature of the image to be verified using a trained feature recognition model; a fourth recognition unit configured to acquire a reference image and to recognize a second global feature of the reference image using a trained feature recognition model; a first verification unit configured to verify the image to be verified using the method of the first aspect in response to determining that the similarity between the first global feature and the second global feature meets a third similarity threshold; or, a second verification unit configured to determine that the image to be verified fails verification in response to determining that the similarity between the first global feature and the second global feature does not meet the third similarity threshold.

[0016] According to a sixth aspect of this disclosure, an apparatus for training a model is provided, comprising: a first acquisition unit configured to acquire at least one piece of sample data, the sample data including a sample image and labels of local images of various local regions in the sample image; a second acquisition unit configured to acquire initial local feature recognition models for recognizing features of various local regions; a prediction unit configured to, for each local image in the local images of various regions, input the local image into the initial local feature recognition model for recognizing features of the local region to which the local image belongs, and obtain local features output by the initial local feature recognition model; a calculation unit configured to acquire a loss between the label of the local image and the label represented by the local features; and a training unit configured to train multiple initial local feature recognition models based on the average of the acquired multiple losses, and obtain multiple target local feature recognition models, wherein the target local feature recognition models are applied to the method of the first aspect or the second aspect.

[0017] According to a seventh aspect of this disclosure, embodiments of this disclosure provide an electronic device, including: one or more processors; and a storage device for storing one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors implement the method for verifying an image as provided in the first aspect, or the method for verifying an image as provided in the second aspect, or the method for training a model as provided in the third aspect.

[0018] According to the eighth aspect of this disclosure, embodiments of this disclosure provide a computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method for verifying an image as provided in the first aspect, or the method for verifying an image as provided in the second aspect, or the method for training a model as provided in the third aspect.

[0019] The method and apparatus for verifying images disclosed herein include: acquiring an image to be verified and using multiple target local feature recognition models to identify first local features of multiple local regions of the image to be verified; acquiring a reference image and using multiple target local feature recognition models to identify second local features of multiple local regions of the reference image; for each of the multiple target local feature recognition models, acquiring the feature similarity between the first local feature identified by the target local feature recognition model and the second local feature identified by the target local feature recognition model; and determining whether the image to be verified passes verification based on the acquired multiple feature similarities, which can improve the system's defense performance against adversarial examples. Furthermore, this method verifies images based on the feature similarity of local regions, eliminating the need for preprocessing of sample data and avoiding the problem of poor detection performance for noise-free samples caused by preprocessing both clean samples (noise-free samples) and adversarial samples (noisy samples). Also, this method does not verify images based on networks obtained through adversarial training methods, thus avoiding the problem of poor performance of networks obtained through adversarial training methods when detecting noise-free samples.

[0020] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.

Description of the Drawings

[0021] The accompanying drawings are provided for a better understanding of this solution and do not constitute a limitation of this application. Wherein:

[0022] Figure 1 is an exemplary system architecture diagram to which embodiments of this application can be applied;

[0023] Figure 2 is a flowchart of one embodiment of the method for verifying images according to this application;

[0024] Figure 3 is a flowchart of another embodiment of the method for verifying images according to this application;

[0025] Figure 4 is a flowchart of one embodiment of the method for verifying images according to this application;

[0026] Figure 5 is a flowchart of global feature recognition in an application scenario of the method for verifying images according to this application;

[0027] Figure 6 is a flowchart of local feature recognition in an application scenario of the method for verifying images according to this application;

[0028] Figure 7 is a flowchart of one embodiment of the method for training a model according to this application;

[0029] Figure 8 is a schematic diagram of one embodiment of the apparatus for verifying images according to this application;

[0030] Figure 9 is a schematic diagram of one embodiment of the apparatus for verifying images according to this application;

[0031] Figure 10 is a schematic diagram of one embodiment of the apparatus for training a model according to this application;

[0032] Figure 11 is a block diagram of an electronic device used to implement the method for verifying images of the embodiments of this application.

Detailed Description

[0033] The following description, in conjunction with the accompanying drawings, illustrates exemplary embodiments of this application, including various details to aid understanding. These should be considered merely exemplary. Therefore, those skilled in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of this application. Similarly, for clarity and brevity, descriptions of well-known functions and structures are omitted in the following description.

[0034] It should be noted that all personal information data involved in this application embodiment has been voluntarily authorized by the user, and the acquisition, storage, processing and transmission of personal information comply with the requirements of relevant laws and regulations.

[0035] Figure 1 shows an exemplary system architecture 100 to which embodiments of the method or apparatus for verifying images of this application may be applied.

[0036] As shown in Figure 1, system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. Network 104 serves as the medium for providing communication links between terminal devices 101, 102, and 103 and server 105. Network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

[0037] Users can use terminal devices 101, 102, and 103 to interact with server 105 via network 104 to receive or send messages, etc. Terminal devices 101, 102, and 103 can be user terminal devices, on which various client applications can be installed, such as image recognition applications, video recognition applications, playback applications, search applications, financial applications, etc.

[0038] Terminal devices 101, 102, and 103 can be various electronic devices with displays that support receiving messages from the server, including but not limited to smartphones, tablets, e-book readers, electronic players, laptops, and desktop computers.

[0039] Terminal devices 101, 102, and 103 can be either hardware or software. When terminal devices 101, 102, and 103 are hardware, they can be various electronic devices. When terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above. They can be implemented as multiple software programs or software modules (e.g., multiple software modules used to provide distributed services) or as a single software program or software module. No specific limitations are made here.

[0040] Server 105 can acquire the image to be verified through terminal devices 101, 102, and 103, and use multiple target local feature recognition models to identify the first local features of multiple local regions of the image to be verified. It can also acquire a reference image and use the multiple target local feature recognition models to identify the second local features of multiple local regions of the reference image. For each of the multiple target local feature recognition models, the server acquires the feature similarity between the first local feature determined by that model and the second local feature determined by that model. Then, based on the acquired multiple feature similarities, the server determines whether the image to be verified passes the verification.

[0041] It should be noted that the method for verifying images provided in the embodiments of this disclosure can be executed by terminal devices 101, 102, and 103 or by server 105. Accordingly, the apparatus for verifying images can be located in terminal devices 101, 102, and 103 or in server 105.

[0042] It should be understood that the number of terminal devices, networks, and servers shown in Figure 1 is merely illustrative. Depending on implementation needs, any number of terminal devices, networks, and servers can be included.

[0043] With continued reference to Figure 2, a flow 200 of an embodiment of a method for verifying images according to the present disclosure is shown, comprising the following steps:

[0044] Step 201: Obtain the image to be verified, and use multiple target local feature recognition models to identify the first local features of multiple local regions of the image to be verified.

[0045] In this embodiment, the execution entity of the method for verifying images (e.g., the server 105 shown in Figure 1) can acquire the image to be verified and use multiple target local feature recognition models to identify the first local features of multiple local regions of the image to be verified. That is, multiple target local feature recognition models are acquired, and each model is used to identify a different local region of the image to be verified, so as to obtain the first local feature of each local region. The image to be verified can be a face image, or an image containing any target object (such as animals, plants, landscapes, drawings, various items, etc.). The target local feature recognition model can be a trained deep learning model, linear regression model, etc., obtained from the Internet, local storage, or cloud storage.

[0046] Step 202: Obtain the reference image and use multiple target local feature recognition models to identify the second local features of multiple local regions of the reference image.

[0047] In this embodiment, a reference image can be acquired, and multiple target local feature recognition models can be used to identify the second local features of multiple local regions of the reference image. That is, multiple target local feature recognition models are acquired, and each model is used to identify a different local region of the reference image, so as to obtain the second local feature of each local region. The reference image serves as the benchmark against which the image to be verified is compared. For example, if the image to be verified is a face image used to verify whether the current user has permission to log in to or operate an account, the reference image could be the face image registered when the account was created.

[0048] Step 203: For each of the multiple target local feature recognition models, obtain the feature similarity between the first local feature determined by the target local feature recognition model and the second local feature determined by the target local feature recognition model.

[0049] In this embodiment, for each of the multiple target local feature recognition models, a first local feature determined by that target local feature recognition model and a second local feature determined by that target local feature recognition model can be obtained, and the feature similarity between the first local feature and the second local feature obtained based on the target local feature recognition model can be calculated. It can be understood that multiple sets of first local features and second local features can be obtained based on multiple target local feature recognition models, and multiple feature similarities can be calculated.
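Step 203 does not fix a similarity measure. As a minimal sketch, assuming cosine similarity over numeric feature vectors and a hypothetical dictionary that maps each model (keyed by a region name) to the feature it produced, the per-model comparison could look like this:

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero feature counts as "no similarity"
    return dot / (norm_a * norm_b)


def per_model_similarities(
    first_features: Dict[str, Sequence[float]],
    second_features: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """One similarity per model: compare the features that the same model
    produced for the image to be verified and for the reference image."""
    return {
        name: cosine_similarity(first_features[name], second_features[name])
        for name in first_features
    }
```

Each model contributes exactly one similarity value, so N models yield the N feature similarities consumed by step 204.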

[0050] Step 204: Based on the obtained multiple feature similarities, determine whether the image to be verified passes the verification.

[0051] In this embodiment, whether the image to be verified passes verification can be determined based on the acquired multiple feature similarities. Specifically, the decision can be based on whether the average of all feature similarities exceeds a preset threshold, whether the median of all feature similarities exceeds a preset threshold, or on other statistics of the feature similarities.
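The mean and median rules named in step 204 can be sketched as follows. The function name `passes_by_statistic` and its parameters are hypothetical, and the threshold value is left to the caller because the source does not give one.

```python
from statistics import mean, median
from typing import Sequence


def passes_by_statistic(
    similarities: Sequence[float],
    threshold: float,
    statistic: str = "mean",
) -> bool:
    """Pass when the chosen statistic of all feature similarities exceeds the threshold."""
    if not similarities:
        raise ValueError("at least one feature similarity is required")
    if statistic == "mean":
        value = mean(similarities)
    elif statistic == "median":
        value = median(similarities)
    else:
        raise ValueError(f"unsupported statistic: {statistic}")
    return value > threshold
```

The two rules can disagree: for similarities 0.9, 0.8, and 0.1 with a threshold of 0.7, the median rule passes while the mean rule does not, because one badly matching region pulls the mean down.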

[0052] The method for verifying images provided in this embodiment acquires an image to be verified and uses multiple target local feature recognition models to identify first local features of multiple local regions in the image to be verified; acquires a reference image and uses the multiple target local feature recognition models to identify second local features of multiple local regions in the reference image; for each of the multiple target local feature recognition models, acquires the feature similarity between the first local feature identified by that model and the second local feature identified by that model; and, based on the acquired multiple feature similarities, determines whether the image to be verified passes verification. This can improve the system's defense performance against adversarial examples. Furthermore, this method verifies images based on the feature similarity of local regions, eliminating the need for preprocessing of sample data and avoiding the poor detection performance on noise-free samples that results from preprocessing both clean samples (noise-free samples) and adversarial samples (noisy samples). In addition, this method does not verify images with networks obtained through adversarial training, thus avoiding the poor performance of such networks when detecting noise-free samples.

[0053] With continued reference to Figure 3, a flow 300 of another embodiment of a method for verifying images according to the present disclosure is shown, comprising the following steps:

[0054] Step 301: Obtain the face image to be verified and divide the face image to be verified into different face regions.

[0055] In this embodiment, the execution entity of the method for verifying images (e.g., the server 105 shown in Figure 1) can acquire the face image to be verified and divide the face image into different face regions, such as the forehead region, eye region, cheek regions, mouth region, etc. The face regions can be divided using a pre-trained facial region segmentation model, or based on preset proportion data or location information for each face region.
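Division by preset proportion data can be illustrated with horizontal bands. This is a minimal sketch only: the band fractions in `REGION_BANDS` are hypothetical placeholders rather than values from the source, and the image is modelled as a plain list of pixel rows to keep the example free of dependencies.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # rows of pixel values (grayscale for simplicity)

# Hypothetical (top, bottom) fractions of the image height for each region.
REGION_BANDS: Dict[str, Tuple[float, float]] = {
    "forehead": (0.0, 0.25),
    "eyes": (0.25, 0.5),
    "cheeks": (0.5, 0.75),
    "mouth": (0.75, 1.0),
}


def split_face_regions(
    image: Image,
    bands: Dict[str, Tuple[float, float]] = REGION_BANDS,
) -> Dict[str, Image]:
    """Crop one sub-image per region using preset vertical proportions."""
    height = len(image)
    regions: Dict[str, Image] = {}
    for name, (top, bottom) in bands.items():
        start, stop = int(top * height), int(bottom * height)
        regions[name] = [row[:] for row in image[start:stop]]  # copy the rows
    return regions
```

A pre-trained segmentation model, the other option named in the text, would replace the fixed bands with per-image region masks or boxes.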

[0056] Step 302: For each face region in the face image to be verified, the first local feature of the face region is identified using a target local feature recognition model for identifying the features of that face region.

[0057] In this embodiment, for each facial region in the face image to be verified, a target local feature recognition model for identifying the features of that facial region is used to identify the first local feature of that facial region. For example, for the eye region in the face image to be verified, a target local feature recognition model for identifying the features of the eye region is used to identify the eye region image of the face image to be verified, and the features of the eye region of the face image to be verified are obtained. The features obtained based on the face image to be verified can be called the first local feature.

[0058] Step 303: Obtain a reference face image and divide the reference face image into different face regions.

[0059] In this embodiment, a reference face image can be acquired and divided into different face regions, such as the forehead region, eye region, cheek regions, mouth region, etc. The face regions can be divided using a pre-trained facial region segmentation model, or based on preset proportional data or location information of each face region.

[0060] Step 304: For each face region in the reference face image, a target local feature recognition model for recognizing the features of that face region is used to identify the second local feature of that face region.

[0061] In this embodiment, for each face region in the reference face image, a target local feature recognition model for identifying the features of that face region is used to identify the second local feature of that face region. For example, for the eye region in the reference face image, a target local feature recognition model for identifying the features of the eye region is used to identify the eye region image of the reference face image, and the features of the eye region of the reference face image are obtained. The features obtained based on the reference face image can be called the second local feature.
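Steps 302 and 304 apply the same region-specific model to the matching region of both images. A minimal sketch of that dispatch, assuming each model is a callable that maps a region image to a feature vector (the names `Extractor` and `extract_local_features` are hypothetical):

```python
from typing import Callable, Dict, List, Sequence

Region = List[List[float]]  # a cropped region as rows of pixel values
Extractor = Callable[[Region], Sequence[float]]


def extract_local_features(
    regions: Dict[str, Region],
    models: Dict[str, Extractor],
) -> Dict[str, Sequence[float]]:
    """Run each region through the model registered for that region."""
    missing = set(regions) - set(models)
    if missing:
        raise KeyError(f"no model registered for regions: {sorted(missing)}")
    return {name: models[name](region) for name, region in regions.items()}
```

Calling this function once on the regions of the face image to be verified and once on the regions of the reference face image, with the same `models` dictionary, yields the first and second local features keyed by region.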

[0062] Step 305: For each of the multiple target local feature recognition models, obtain the feature similarity between the first local feature recognized by the target local feature recognition model and the second local feature recognized by the target local feature recognition model.

[0063] Step 306: Based on the obtained multiple feature similarities, determine whether the image to be verified passes the verification.

[0064] In this embodiment, the descriptions of steps 305 and 306 are consistent with those of steps 203 and 204, and will not be repeated here.

[0065] Compared with the embodiment described in Figure 2, the method for verifying images provided in this embodiment verifies a face image. When verifying a face image, the face image can be divided into different local regions according to face regions, and the face image to be verified is verified based on the feature similarity between each face region of the face image to be verified and the corresponding face region of the reference face image.

[0066] Because adversarial examples alter pixel values compared to the original images, and the perturbations added in adversarial examples often do not act on individual pixels but rather exhibit continuity across pixels at different locations, with the perturbation values at different locations exhibiting dependencies, a target local feature recognition model can disrupt this continuity and dependency of the adversarial perturbations, rendering the adversarial attack ineffective. Furthermore, this method has minimal impact on the pass rate of real faces in face recognition systems and does not affect the network's recognition / verification performance on clean samples.

[0067] In some optional implementations of the embodiments described above in conjunction with Figure 2 and Figure 3, determining whether an image to be verified passes verification based on multiple acquired feature similarities includes: in response to determining that at least one feature similarity among the multiple feature similarities satisfies a first similarity threshold, determining that the image to be verified passes verification.

[0068] In this embodiment, when determining whether an image to be verified passes verification based on the acquired multiple feature similarities, if any feature similarity satisfies a preset first similarity threshold among the multiple feature similarities, then the image to be verified is determined to pass verification, thereby improving the efficiency of image verification.

[0069] In some optional implementations of the embodiments described above in conjunction with Figure 2 and Figure 3, determining whether an image to be verified passes verification based on multiple acquired feature similarities includes: in response to determining that each of the multiple feature similarities satisfies a second similarity threshold, determining that the image to be verified passes verification.

[0070] In this embodiment, when determining whether an image to be verified passes verification based on multiple acquired feature similarities, if each of the multiple feature similarities is determined to meet a preset second similarity threshold, then the image to be verified is determined to pass verification, thereby improving the accuracy of image verification.
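The two optional decision rules above can be contrasted in a short sketch. Reading "satisfies the threshold" as "greater than or equal to the threshold" is an assumption, as are the function names; an empty list of similarities is treated as a failure in both rules.

```python
from typing import Sequence


def passes_any(similarities: Sequence[float], first_threshold: float) -> bool:
    """Pass if at least one feature similarity meets the first threshold."""
    return any(s >= first_threshold for s in similarities)


def passes_all(similarities: Sequence[float], second_threshold: float) -> bool:
    """Pass only if every feature similarity meets the second threshold."""
    # all() is True for an empty sequence, so guard against that case.
    return len(similarities) > 0 and all(s >= second_threshold for s in similarities)
```

The "any" rule can stop at the first similarity that meets the threshold, which favors efficiency, while the "all" rule requires every region to agree, which favors accuracy.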

[0071] Continuing to refer to Figure 4, a flow 400 of an embodiment of a method for verifying images according to the present disclosure is illustrated, comprising the following steps:

[0072] Step 401: Obtain the image to be verified and use the trained feature recognition model to identify the first global features of the image to be verified.

[0073] In this embodiment, the execution entity of the method for verifying images (e.g., the server 105 shown in Figure 1) can acquire the image to be verified and use a trained feature recognition model to identify the first global features of the image. This trained feature recognition model performs feature recognition on the global image region, i.e., the whole image without region segmentation, to obtain the global features of the image. Global features are defined relative to local features. For ease of distinction, the global features identified based on the image to be verified can be called the first global features.

[0074] Step 402: Obtain the reference image and use the trained feature recognition model to identify the second global features of the reference image.

[0075] In this embodiment, a reference image can be acquired, and a trained feature recognition model can be used to identify the second global features of the reference image. For ease of distinction, the global features identified based on the reference image can be referred to as the second global features.

[0076] Step 4031: In response to determining that the similarity between the first global feature and the second global feature meets the third similarity threshold, verify the image to be verified using the method described in the embodiment of Figure 2 or Figure 3.

[0077] In this embodiment, if the similarity between the first global feature and the second global feature meets a preset third similarity threshold, the method described in the embodiment of Figure 2 or Figure 3 can further be used to divide the image to be verified and the reference image into regions, and then to perform verification again based on the similarity between the regional features of the corresponding regions, thereby improving the accuracy of verifying the image to be verified.

[0078] Step 4032: In response to determining that the similarity between the first global feature and the second global feature does not meet the third similarity threshold, the image to be verified is determined to have failed verification.

[0079] In this embodiment, if it is determined that the similarity between the first global feature and the second global feature does not meet the preset third similarity threshold, it can be determined that the image to be verified is not similar to the reference image, and the image to be verified fails the verification.
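Steps 4031 and 4032 amount to a two-stage check in which the local comparison runs only after the global comparison succeeds. The following is a minimal sketch under stated assumptions: the function name, the callable `compute_local_similarities`, the `>=` reading of "meets the threshold", and the `require_all` switch are all hypothetical.

```python
from typing import Callable, Iterable


def verify_with_global_gate(
    global_similarity: float,
    compute_local_similarities: Callable[[], Iterable[float]],
    third_threshold: float,
    local_threshold: float,
    require_all: bool = True,
) -> bool:
    """Global check first; local features are computed only if it passes."""
    # Step 4032: dissimilar global features fail immediately.
    if global_similarity < third_threshold:
        return False
    # Step 4031: fall through to the local-region verification.
    similarities = list(compute_local_similarities())
    if not similarities:
        return False
    check = all if require_all else any
    return check(s >= local_threshold for s in similarities)
```

Passing the local comparison as a callable means the region-level models are never invoked for images rejected at the global stage, which reflects the resource saving this embodiment describes.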

[0080] Compared with the methods in the embodiments described in Figure 2 or Figure 3, the method for verifying images provided in this embodiment first compares the similarity between the image to be verified and the reference image based on global image features. Only after determining that the global features of the image to be verified and the reference image are similar does it use the method described in the embodiment of Figure 2 or Figure 3 to compare the similarity between the image to be verified and the reference image based on local features of the image. The local-feature comparison is therefore applied only to images to be verified that have already passed global feature verification, rather than to all images to be verified. This improves the accuracy of image verification, increases the efficiency of image verification, and avoids wasting server resources on the calculation and storage of a large number of local features.

[0081] In some application scenarios, as shown in Figure 5, the method for verifying images can be applied to a face recognition system, which can acquire the face image to be verified (the image to be verified) input by the user, and use a global feature recognition model to extract the global features of the face image to be verified.

[0082] The face recognition system obtains the registered face image (the reference image) of the user from local or cloud storage, and extracts the global features of the registered face image using the global feature recognition model.

[0083] The global features of the face image to be verified are compared with the global features of the registered face image. If they are determined to be dissimilar, the face image to be verified fails verification; if they are determined to be similar, the method shown in Figure 6 is used to further verify the face image to be verified.

[0084] In the method for verifying face images shown in Figure 6, the face image to be verified is divided into different face regions. For each face region, a target local feature recognition model is used to identify the first local feature of that face region.

[0085] The registered face image is divided into different face regions. For each face region, a target local feature recognition model is used to identify the second local feature of that face region. In Figure 6, local models 1 to n are the n target local feature recognition models used to identify the features of each local region.

[0086] The similarity between the first local feature and the second local feature identified by the same target local feature recognition model is compared, and a similarity sequence [S_1, S_2, ..., S_n] is obtained. Each similarity S_i (1≤i≤n) in the sequence represents the similarity comparison result between the local features extracted by a target local feature recognition model from a certain face region in the face image to be verified and the local features extracted by the same target local feature recognition model from the same face region in the registered face image.

[0087] Finally, based on all the comparison results, it is determined whether the image to be verified passes the verification. Specifically, if any similarity S_i in the similarity sequence [S_1, S_2, ..., S_n] satisfies the similarity threshold, the face recognition system can determine that the face image to be verified has passed verification, or that the face image to be verified is not an adversarial example used to attack the face recognition system. Alternatively, if all similarities S_i in the similarity sequence [S_1, S_2, ..., S_n] satisfy the similarity threshold, the face recognition system can determine that the face image to be verified has passed verification, or that the face image to be verified is not an adversarial example used to attack the face recognition system.

[0088] Continuing to refer to Figure 7, a flow 700 of an embodiment of a method for training a model according to the present disclosure is illustrated, comprising the following steps:

[0089] Step 701: Obtain at least one sample data point, which includes a sample image and the labels of local images of each local region in the sample image.

[0090] In this embodiment, the execution entity of the method for training the model (e.g., the server 105 shown in Figure 1) can acquire at least one sample data point through terminal devices, cloud storage, or local storage. A sample data point may include a sample image and labels for the local images of the various local regions within that sample image. For example, a sample data point may include a face image together with parameters such as the size and pixel features of the forehead region, the size and interpupillary distance of the eye region, and the size of the mouth region.

[0091] Step 702: Obtain initial local feature recognition models for recognizing features of each local region.

[0092] In this embodiment, initial local feature recognition models can be obtained for identifying features of various local regions in a sample image. Different initial local feature recognition models are used to identify features of different local regions in the sample image. Each initial local feature recognition model can be any type of deep learning model.

[0093] Step 703: For each local image in the local images of each region, input the local image into an initial local feature recognition model for identifying the features of the local region to which the local image belongs, and obtain the local features output by the initial local feature recognition model.

[0094] In this embodiment, for each local image of each local region, the local image is input into the initial local feature recognition model used to identify the features of the local region to which the local image belongs, to obtain the local features output by that initial local feature recognition model. For example, for the eye image, mouth image, and cheek image in the sample face image, the eye image is input into an initial local feature recognition model A used to identify the features of the eye region, to obtain the eye features output by the initial local feature recognition model A; the mouth image is input into an initial local feature recognition model B used to identify the features of the mouth region, to obtain the mouth features output by the initial local feature recognition model B; and the cheek image is input into an initial local feature recognition model C used to identify the features of the cheek region, to obtain the cheek features output by the initial local feature recognition model C.
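The routing of each local image to its own region-specific model in step 703 can be sketched as a simple lookup. The function `extract_local_features` and the representation of a model as a plain callable are hypothetical stand-ins; in practice each entry would be a deep learning model, and the toy callables in the usage example carry no meaning beyond illustration.

```python
from typing import Callable, Dict, Sequence

# A model is represented here as any callable mapping a local image to features.
Model = Callable[[Sequence[float]], Sequence[float]]


def extract_local_features(
    local_images: Dict[str, Sequence[float]],
    models: Dict[str, Model],
) -> Dict[str, Sequence[float]]:
    """Feed each local image to the model dedicated to its region."""
    features = {}
    for region, image in local_images.items():
        if region not in models:
            raise KeyError(f"no local feature recognition model for region {region!r}")
        features[region] = models[region](image)
    return features


# Usage with toy stand-in models (hypothetical, for illustration only).
toy_models = {
    "eyes": lambda img: [sum(img) / len(img)],
    "mouth": lambda img: [max(img)],
}
toy_features = extract_local_features({"eyes": [1, 2, 3], "mouth": [4, 9, 2]}, toy_models)
```

Keeping one model per region means the same mapping can be reused at verification time for both the image to be verified and the reference image.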

[0095] Step 704: Obtain the loss between the labels of the local image and the labels represented by the local features.

[0096] In this embodiment, the loss between the label of a local image and the label represented by the local features can be obtained. For example, for an eye image, the loss between the label of the eye image in the sample data and the label represented by the eye features identified by the initial local feature model A in step 703 can be obtained. The label represented by the eye features can be a size parameter describing the eye feature (such as interpupillary distance) or information describing the shape of the eye (such as almond-shaped eyes).

[0097] Step 705: Based on the average of the obtained multiple losses, train the multiple initial local feature recognition models to obtain multiple target local feature recognition models. The target local feature recognition models are applied to the method for verifying an image described in the embodiments of Figure 2, Figure 3, or Figure 4.

[0098] In this embodiment, since the sample image contains local images belonging to various local regions, after using the corresponding initial local feature recognition model to identify the local features of each local image, and calculating the loss based on the label of each local image and the label represented by each local feature, multiple losses can be obtained.

[0099] Multiple initial local feature recognition models can be trained based on the average of the obtained losses. For example, the following loss function can be used as the loss function for training multiple initial local feature recognition models:

[0100]

[0101] Where i represents the identifier of the initial local feature recognition model, the identifier of the target local feature recognition model, and the identifier of the local image of each region in the sample image (1≤i≤N), and N represents the total number of initial local feature recognition models.

[0102] L represents the loss between the local feature F_i, extracted from the local image x_i by the initial local feature recognition model i, and the label y_i of the local image x_i in the sample data.

[0103] The angle in the loss function is the angle formed between the model parameter W_i and the local feature F_i. The model parameter W represents the parameters of each layer of the local feature recognition model; models with different W have different feature extraction capabilities for the input image. The smaller the angle formed between the feature F_i extracted from the input image by the model and W_i, the closer the two are, and the greater the probability that the image should be identified as the i-th label.

[0104] m is the preset angle margin. Setting m enlarges the angle used in the comparison, which can constrain the model parameters.

[0105] e is the base of the natural logarithm; s is the scaling factor; j represents the counting identifier, which has the same range of values as i.
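The symbols defined above (an angle between model parameters and features, an angle margin m, a scaling factor s, the base e, and a counting index j) are consistent with an additive angular margin softmax loss. The sketch below shows one loss of that family as an assumption only; it is not presented as the exact formula of this disclosure. The function names and the default values of `s` and `m` are hypothetical.

```python
import math
from typing import Sequence


def angular_margin_loss(cosines: Sequence[float], label: int,
                        s: float = 30.0, m: float = 0.5) -> float:
    """Assumed form: -log(e^{s*cos(theta_label + m)} / sum_j e^{logit_j}).

    cosines[j] is the cosine of the angle between the feature F and the
    parameter vector W_j; the margin m is added to the labelled angle only.
    """
    theta = math.acos(max(-1.0, min(1.0, cosines[label])))
    target_logit = s * math.cos(theta + m)
    logits = [target_logit if j == label else s * c for j, c in enumerate(cosines)]
    # Numerically stable log-sum-exp over all logits.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - target_logit


def average_loss(losses: Sequence[float]) -> float:
    """Average the per-model losses, as used for training in step 705."""
    return sum(losses) / len(losses)
```

In this assumed form, the margin makes the labelled angle look larger during training, so the model must pull features closer to the matching parameters to reach the same loss, which is one way to read the constraining effect of m.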

[0106] When training each initial local feature recognition model using the above loss function, the training objective is to minimize the loss function, and the model parameters W are gradually optimized through iterative training operations.

[0107] The method for training a model provided in this embodiment involves acquiring at least one piece of sample data, including a sample image and labels for local images of various local regions within the sample image; acquiring initial local feature recognition models for identifying features of each local region; for each local image in each region, inputting the local image into the initial local feature recognition model for identifying features of the local region to which the local image belongs, and obtaining the local features output by the initial local feature recognition model; acquiring the loss between the label of the local image and the label represented by the local features; and training multiple initial local feature recognition models based on the average of the acquired multiple losses, thereby obtaining multiple target local feature recognition models. The method can determine whether an input image is an adversarial example by comparing the similarity of different local regions in the sample image.

[0108] Because adversarial examples alter pixel values compared to the original images, and the perturbations added in adversarial examples often do not act on individual pixels but rather exhibit continuity across pixels at different locations, with dependencies between perturbation values, a trained target local feature recognition model can disrupt this continuity and dependency of the adversarial perturbations, identifying the adversarial example image and rendering the adversarial attack ineffective. Furthermore, compared to adversarially training the model to defend against adversarial examples, this method avoids overfitting, which can negatively impact the recognition performance on clean samples. Finally, compared to preprocessing input data (such as image compression) before inputting it into the model for sample image detection to defend against adversarial examples, this method avoids indiscriminately preprocessing clean samples, thus preventing degradation of the model's recognition performance on clean samples.

[0109] With further reference to Figure 8, as an implementation of the methods shown in the above figures, this disclosure provides an embodiment of an apparatus for verifying images. The apparatus embodiment corresponds to the method embodiments shown in Figure 2 and Figure 3, and the apparatus can be specifically applied to various electronic devices.

[0110] As shown in Figure 8, the image verification device of this embodiment includes: a first recognition unit 801, a second recognition unit 802, a matching unit 803, and a verification unit 804. The first recognition unit is configured to acquire an image to be verified and use multiple target local feature recognition models to identify first local features of multiple local regions of the image to be verified, respectively. The second recognition unit is configured to acquire a reference image and use multiple target local feature recognition models to identify second local features of multiple local regions of the reference image, respectively. The matching unit is configured to, for each of the multiple target local feature recognition models, acquire the feature similarity between the first local feature identified by the target local feature recognition model and the second local feature identified by the target local feature recognition model. The verification unit is configured to determine whether the image to be verified passes verification based on the acquired multiple feature similarities.

[0111] In some embodiments, the first recognition unit includes: a first segmentation module configured to acquire a face image to be verified and segment the face image to be verified into different face regions; and a first recognition module configured to, for each face region in the face image to be verified, use a target local feature recognition model for recognizing features of the face region to recognize a first local feature of the face region; the second recognition unit includes: a second segmentation module configured to acquire a reference face image and segment the reference face image into different face regions; and a second recognition module configured to, for each face region in the reference face image, use a target local feature recognition model for recognizing features of the face region to recognize a second local feature of the face region.

[0112] In some embodiments, the verification unit includes: a first verification module configured to determine that the image to be verified passes verification in response to determining that at least one feature similarity among a plurality of feature similarities satisfies a first similarity threshold.

[0113] In some embodiments, the verification unit includes a second verification module configured to determine that the image to be verified passes verification in response to determining that each of a plurality of feature similarities satisfies a second similarity threshold.

[0114] The units in the aforementioned device 800 correspond to the steps in the method described with reference to Figure 2 and Figure 3. Therefore, the operations, features, and technical effects described above for the method used to verify images also apply to the device 800 and the units contained therein, and will not be repeated here.

[0115] With further reference to Figure 9, as an implementation of the methods shown in the above figures, this disclosure provides an embodiment of an apparatus for verifying images. The apparatus embodiment corresponds to the method embodiment shown in Figure 4, and the apparatus can be specifically applied to various electronic devices.

[0116] As shown in Figure 9, the image verification device of this embodiment includes: a third recognition unit 901, a fourth recognition unit 902, and a first verification unit 9031 or a second verification unit 9032. The third recognition unit is configured to acquire the image to be verified and use a trained feature recognition model to identify a first global feature of the image to be verified; the fourth recognition unit is configured to acquire a reference image and use a trained feature recognition model to identify a second global feature of the reference image; the first verification unit is configured to, in response to determining that the similarity between the first global feature and the second global feature meets a third similarity threshold, verify the image to be verified using the method described in the embodiment of Figure 2 or Figure 3; or, the second verification unit is configured to determine that the image to be verified has failed verification in response to determining that the similarity between the first global feature and the second global feature does not meet the third similarity threshold.

[0117] The units in the aforementioned device 900 correspond to the steps in the method described with reference to Figure 4. Therefore, the operations, features, and technical effects described above for the method used to verify images also apply to the device 900 and the units contained therein, and will not be repeated here.

[0118] With further reference to Figure 10, as an implementation of the methods shown in the above figures, this disclosure provides an embodiment of an apparatus for training a model. The apparatus embodiment corresponds to the method embodiment shown in Figure 7, and the apparatus can be specifically applied to various electronic devices.

[0119] As shown in Figure 10, the apparatus for training a model in this embodiment includes: a first acquisition unit 1001, a second acquisition unit 1002, a prediction unit 1003, a calculation unit 1004, and a training unit 1005. The first acquisition unit is configured to acquire at least one piece of sample data, including a sample image and labels for local images of various local regions within the sample image; the second acquisition unit is configured to acquire initial local feature recognition models for identifying features of each local region; the prediction unit is configured to, for each local image of each region, input the local image into the initial local feature recognition model for identifying features of the local region to which the local image belongs, and obtain the local features output by the initial local feature recognition model; the calculation unit is configured to acquire the loss between the label of the local image and the label represented by the local features; the training unit is configured to train multiple initial local feature recognition models based on the average of the acquired multiple losses, and obtain multiple target local feature recognition models, wherein the target local feature recognition models are applied to the method for verifying an image described in the embodiments of Figure 2, Figure 3, or Figure 4.

[0120] The units in the aforementioned device 1000 correspond to the steps in the method described with reference to Figure 7. Therefore, the operations, features, and technical effects described above for the method used to train the model also apply to the device 1000 and the units contained therein, and will not be repeated here.

[0121] According to embodiments of this application, this application also provides an electronic device and a readable storage medium.

[0122] Figure 11 shows a block diagram of an electronic device 1100 for the method of verifying an image according to an embodiment of this application. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementation of the present application described and/or claimed herein.

[0123] As shown in Figure 11, the electronic device includes one or more processors 1101, a memory 1102, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are interconnected via different buses and can be mounted on a common motherboard or otherwise as required. The processors can process instructions executed within the electronic device, including instructions stored in or on memory to display graphical information of a GUI on an external input/output device (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses can be used with multiple memories, if desired. Similarly, multiple electronic devices can be connected, each providing some of the necessary operations (e.g., as a server array, a group of blade servers, or a multiprocessor system). Figure 11 takes one processor 1101 as an example.

[0124] The memory 1102 is the non-transitory computer-readable storage medium provided in this application. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the image verification method provided in this application. The non-transitory computer-readable storage medium of this application stores computer instructions for causing a computer to perform the image verification method provided in this application.

[0125] Memory 1102, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the image verification method in the embodiments of this application (e.g., the first recognition unit 801, the second recognition unit 802, the matching unit 803, and the verification unit 804 shown in Figure 8). The processor 1101 executes various functional applications and data processing of the server by running the non-transitory software programs, instructions, and modules stored in the memory 1102, that is, it implements the method for verifying images in the above method embodiments.

[0126] Memory 1102 may include a program storage area and a data storage area. The program storage area may store the operating system and applications required for at least one function; the data storage area may store data created based on the use of the electronic device for verifying images. Furthermore, memory 1102 may include high-speed random access memory and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 1102 may optionally include memory remotely located relative to processor 1101, and this remote memory may be connected to the electronic device for verifying images via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

[0127] The electronic device for the method of verifying images may further include: an input device 1103, an output device 1104, and a bus 1105. The processor 1101, memory 1102, input device 1103, and output device 1104 may be connected via the bus 1105 or by other means. Figure 11 takes connection via the bus 1105 as an example.

[0128] Input device 1103 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for verifying images. Examples of input devices include touch screens, keypads, mice, trackpads, touchpads, one or more mouse buttons, trackballs, and joysticks. Output device 1104 may include display devices, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibration motors). The display device may include, but is not limited to, liquid crystal displays (LCDs), light-emitting diode (LED) displays, and plasma displays. In some embodiments, the display device may be a touch screen.

[0129] Various implementations of the systems and techniques described herein can be implemented in digital electronic circuit systems, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and / or combinations thereof. These various implementations may include: implementations in one or more computer programs that can be executed and / or interpreted on a programmable system including at least one programmable processor, which may be a dedicated or general-purpose programmable processor, capable of receiving data and instructions from a storage system, at least one input device, and at least one output device, and transferring data and instructions to the storage system, the at least one input device, and the at least one output device.

[0130] These computational programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor and can be implemented using high-level procedural and / or object-oriented programming languages, and / or assembly / machine languages. As used herein, the terms “machine-readable medium” and “computer-readable medium” refer to any computer program product, device, and / or apparatus (e.g., disk, optical disk, memory, programmable logic device (PLD)) used to provide machine instructions and / or data to a programmable processor, including machine-readable media that receive machine instructions as machine-readable signals. The term “machine-readable signal” refers to any signal used to provide machine instructions and / or data to a programmable processor.

[0131] To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having: a display device for displaying information to the user (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor); and a keyboard and pointing device (e.g., a mouse or trackball) through which the user provides input to the computer. Other types of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form (including sound input, voice input, or tactile input).

[0132] The systems and techniques described herein can be implemented in a computing system that includes a backend component (e.g., a data server), a computing system that includes a middleware component (e.g., an application server), a computing system that includes a frontend component (e.g., a user computer with a graphical user interface or web browser through which a user can interact with implementations of the systems and techniques described herein), or any combination of such backend, middleware, or frontend components. The components of the system can be interconnected by digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.

[0133] A computer system can include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other.

[0134] It should be understood that steps may be reordered, added, or deleted using the various forms of processes shown above. For example, the steps described in this application may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution disclosed in this application can be achieved; no limitation is imposed herein.

[0135] The specific embodiments described above do not constitute a limitation on the scope of protection of this application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this application should be included within the scope of protection of this application.

Claims

1. A method for verifying an image, comprising: acquiring an image to be verified, and using a plurality of target local feature recognition models to respectively identify first local features of a plurality of local regions of the image to be verified, including: acquiring a face image to be verified; dividing the face image to be verified into face regions using a pre-trained face region segmentation model, or based on preset proportional data or region location information of each face region; and, for each face region in the face image to be verified, using the target local feature recognition model for recognizing features of that face region to identify the first local feature of that face region; acquiring a reference image, and using the plurality of target local feature recognition models to respectively identify second local features of a plurality of local regions of the reference image; for each of the plurality of target local feature recognition models, obtaining a feature similarity between the first local feature recognized by the target local feature recognition model and the second local feature recognized by the target local feature recognition model; and determining, based on the obtained plurality of feature similarities, whether the image to be verified passes verification.
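For illustration only, and not as part of the claims, the flow of claim 1 can be sketched as follows under several assumptions that the claim does not state: the image is a grayscale grid of pixel values, the preset proportional data in `REGION_PROPORTIONS` is hypothetical, cosine similarity stands in for the unspecified feature similarity, and `mean_rows` is a toy stand-in for a trained target local feature recognition model.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Image = List[List[float]]   # grayscale image as rows of pixel values (assumption)
Feature = Sequence[float]

# Hypothetical preset proportional data:
# region name -> (top, bottom, left, right) as fractions of image height/width.
REGION_PROPORTIONS: Dict[str, Tuple[float, float, float, float]] = {
    "eyes": (0.2, 0.5, 0.0, 1.0),
    "nose": (0.4, 0.7, 0.25, 0.75),
    "mouth": (0.65, 0.9, 0.2, 0.8),
}


def crop_regions(image: Image) -> Dict[str, Image]:
    """Divide a face image into regions using the preset proportions."""
    height, width = len(image), len(image[0])
    regions: Dict[str, Image] = {}
    for name, (top, bottom, left, right) in REGION_PROPORTIONS.items():
        rows = image[int(top * height):int(bottom * height)]
        regions[name] = [row[int(left * width):int(right * width)] for row in rows]
    return regions


def cosine_similarity(a: Feature, b: Feature) -> float:
    """Cosine similarity; chosen here as an assumed similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_rows(region: Image) -> Feature:
    """Toy stand-in for a trained local feature recognition model."""
    return [sum(row) / len(row) for row in region]


def local_similarities(
    image: Image,
    reference: Image,
    models: Dict[str, Callable[[Image], Feature]],
) -> Dict[str, float]:
    """Apply each region's model to both images and compare the features."""
    image_regions = crop_regions(image)
    reference_regions = crop_regions(reference)
    return {
        name: cosine_similarity(model(image_regions[name]),
                                model(reference_regions[name]))
        for name, model in models.items()
    }
```

Each region has its own model entry in `models`, so the same model produces both the first and the second local feature for that region, which is the pairing the claim requires before similarities are compared.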

2. The method according to claim 1, wherein acquiring the reference image and using the plurality of target local feature recognition models to respectively identify the second local features of the plurality of local regions of the reference image comprises: acquiring the reference image and dividing the reference image into different face regions; and, for each face region in the reference image, using the target local feature recognition model for recognizing features of that face region to identify the second local feature of that face region.

3. The method according to claim 1, wherein determining, based on the obtained plurality of feature similarities, whether the image to be verified passes verification comprises: in response to determining that at least one of the plurality of feature similarities satisfies a first similarity threshold, determining that the image to be verified passes verification.

4. The method according to claim 1, wherein determining, based on the obtained plurality of feature similarities, whether the image to be verified passes verification comprises: in response to determining that each of the plurality of feature similarities satisfies a second similarity threshold, determining that the image to be verified passes verification.
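For illustration only, and not as part of the claims, the two decision rules of claims 3 and 4 can be contrasted in a short sketch. It assumes that "satisfies a threshold" means greater than or equal to the threshold, which the claims do not specify; the function names are hypothetical.

```python
from typing import Iterable


def passes_any(similarities: Iterable[float], first_threshold: float) -> bool:
    """Claim 3 style rule: pass if at least one similarity meets the threshold."""
    return any(s >= first_threshold for s in similarities)


def passes_all(similarities: Iterable[float], second_threshold: float) -> bool:
    """Claim 4 style rule: pass only if every similarity meets the threshold.

    An empty collection is treated as a failure here; that choice is an
    assumption of this sketch.
    """
    values = list(similarities)
    return bool(values) and all(s >= second_threshold for s in values)
```

With similarities of 0.9, 0.4, and 0.7 and a threshold of 0.8, the first rule passes and the second fails, so the second rule is the stricter of the two at equal thresholds.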

5. A method for verifying an image, comprising: acquiring an image to be verified, and using a trained feature recognition model to identify a first global feature of the image to be verified; acquiring a reference image, and using the trained feature recognition model to identify a second global feature of the reference image; and, in response to determining that the similarity between the first global feature and the second global feature satisfies a third similarity threshold, verifying the image to be verified using the method according to any one of claims 1-4; or, in response to determining that the similarity between the first global feature and the second global feature does not satisfy the third similarity threshold, determining that the image to be verified fails verification.
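For illustration only, and not as part of the claims, the two-stage flow of claim 5 can be sketched as a global check that gates the local check. The sketch assumes cosine similarity as the similarity measure and a greater-than-or-equal comparison against the threshold; `global_model` and `local_verify` are hypothetical callables standing in for the trained feature recognition model and for the local-feature method of claims 1-4.

```python
import math
from typing import Any, Callable, Sequence

Feature = Sequence[float]


def cosine_similarity(a: Feature, b: Feature) -> float:
    """Cosine similarity; an assumed choice of similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verify_with_global_gate(
    image: Any,
    reference: Any,
    global_model: Callable[[Any], Feature],
    local_verify: Callable[[Any, Any], bool],
    third_threshold: float,
) -> bool:
    """Fail early on a low global similarity; otherwise defer to the local check."""
    global_similarity = cosine_similarity(global_model(image), global_model(reference))
    if global_similarity < third_threshold:
        return False  # global features disagree: verification fails
    return local_verify(image, reference)  # local-feature verification decides
```

The local models run only when the global comparison succeeds, so an image that already fails the global check never reaches the per-region comparison.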

6. An apparatus for verifying an image, comprising: a first recognition unit configured to acquire an image to be verified and use a plurality of target local feature recognition models to respectively identify first local features of a plurality of local regions of the image to be verified; a second recognition unit configured to acquire a reference image and use the plurality of target local feature recognition models to respectively identify second local features of a plurality of local regions of the reference image; a matching unit configured to, for each of the plurality of target local feature recognition models, obtain a feature similarity between the first local feature recognized by the target local feature recognition model and the second local feature recognized by the target local feature recognition model; and a verification unit configured to determine, based on the obtained plurality of feature similarities, whether the image to be verified passes verification; wherein the first recognition unit includes: a first segmentation module configured to acquire a face image to be verified, and to divide the face image to be verified into face regions using a pre-trained face region segmentation model, or based on preset proportional data or region location information of each face region; and a first recognition module configured to, for each face region in the face image to be verified, identify the first local feature of that face region using the target local feature recognition model for recognizing features of that face region.

7. The apparatus according to claim 6, wherein the second recognition unit includes: a second segmentation module configured to acquire the reference image and divide the reference image into different face regions; and a second recognition module configured to, for each face region in the reference image, identify the second local feature of that face region using the target local feature recognition model for recognizing features of that face region.

8. The apparatus according to claim 6, wherein the verification unit includes: a first verification module configured to determine that the image to be verified passes verification in response to determining that at least one of the plurality of feature similarities satisfies a first similarity threshold.

9. The apparatus according to claim 6, wherein the verification unit includes: a second verification module configured to determine that the image to be verified passes verification in response to determining that each of the plurality of feature similarities satisfies a second similarity threshold.

10. An apparatus for verifying an image, comprising: a third recognition unit configured to acquire an image to be verified and use a trained feature recognition model to identify a first global feature of the image to be verified; a fourth recognition unit configured to acquire a reference image and use the trained feature recognition model to identify a second global feature of the reference image; and a first verification unit configured to verify the image to be verified using the method according to any one of claims 1-4 in response to determining that the similarity between the first global feature and the second global feature satisfies a third similarity threshold; or a second verification unit configured to determine that the image to be verified fails verification in response to determining that the similarity between the first global feature and the second global feature does not satisfy the third similarity threshold.

11. An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions, when executed, enabling the at least one processor to perform the method according to any one of claims 1-5.

12. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions cause a computer to perform the method according to any one of claims 1-5.