Face recognition method and device, computer device and storage medium
By using an initial recognition network model and an initial domain classifier trained adversarially, a high-efficiency face recognizer was generated, solving the problems of long training time and low recognition accuracy of traditional face recognition models, especially the poor recognition effect of cartoon data, and realizing accurate face recognition in different application scenarios.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: TENCENT TECHNOLOGY (SHENZHEN) CO LTD
- Filing Date: 2022-03-22
- Publication Date: 2026-06-26
AI Technical Summary
Traditional face recognition models are time-consuming to train and have low recognition accuracy, especially when processing cartoon data. This is mainly due to insufficient training data for large models and the complexity of adjusting small models.
Adversarial training is performed using an initial recognition network model and an initial domain classifier. Adversarial learning is used to obtain the common distribution among different categories of face images, expand the training data, and generate an efficient face recognizer.
Adversarial training lets the recognition network learn the common distribution shared by different categories of face images, so training data for scarce categories such as cartoon faces can be expanded with real face data. This improves recognition accuracy across application scenarios, a problem that existing techniques have not effectively solved.
Figure CN116863512B_ABST
Description
Technical Field
[0001] This application relates to the field of artificial intelligence technology, and in particular to a method, apparatus, computer device, and storage medium for generating a face recognizer.
Background Technology
[0002] With the development of artificial intelligence technology and the increasing emphasis on intellectual property and brand protection, it is necessary to perform facial recognition and verification on the data displayed on different application platforms or websites, including image data such as cartoon data, to determine whether the cartoon data includes facial data and the specific person to whom the facial data belongs.
[0003] Because the Internet generates massive amounts of cartoon or video data in real time, traditional manual review is time-consuming, labor-intensive, and inefficient. An alternative approach has therefore emerged that uses model distillation and model stacking to obtain a recognition model suitable for review.
[0004] Model distillation refers to using a large model to distill knowledge into a smaller model. Training the large model requires collecting a large amount of cartoon face data, but the collected cartoon data cannot reach the scale of real face data, so the trained large model underperforms and its recognition accuracy is relatively low. Model stacking, in turn, requires training multiple small models and then stacking them, and the weight ratios between the small models and their corresponding training periods must be tuned, which takes a long period of experimentation and refinement. This leads to a long overall training time, and the recognition accuracy of the trained model cannot be guaranteed.
Summary of the Invention
[0005] Therefore, to address the above technical problems, it is necessary to provide a method, apparatus, computer device, and storage medium for generating a face recognizer that can improve the accuracy of identifying the person to whom a face belongs.
[0006] Firstly, this application provides a method for generating a face recognizer. The method includes:
[0007] The initial recognition network model obtained from the initial training is used to perform facial feature recognition on the training samples in the training sample set to obtain the corresponding facial features; the training samples include face images of the first category and face images of the second category;
[0008] The facial features are classified and identified using the initial domain classifier obtained from the initial training, and the corresponding classification results are generated.
[0009] Based on the facial features and the classification results, the initial recognition network model and the initial domain classifier are subjected to adversarial training, and the initial recognition network model at the end of training is used as the trained face recognizer.
[0010] In one embodiment, the step of performing adversarial training on the initial recognition network model and the initial domain classifier based on the facial features and the classification result, and using the initial recognition network model at the end of training as the trained face recognizer, includes:
[0011] The classification results are then superimposed onto the initial recognition network model in a reverse manner to update the initial recognition network model;
[0012] Based on the facial features output by the updated initial recognition network model, the first parameter of the initial domain classifier is adjusted;
[0013] When it is determined that the first iteration count of the adversarial training, or the joint loss function value of the initial recognition network model and the initial domain classifier, satisfies the first training termination condition, the initial recognition network model at the end of training is used as the trained face recognizer.
[0014] In one embodiment, the step of superimposing the classification results onto the initial recognition network model in a reverse manner to update the initial recognition network model includes:
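The termination rule in [0013] can be sketched as a small predicate. This is a minimal sketch, assuming the condition is either an iteration budget or a joint-loss threshold; the parameter names `max_iterations` and `loss_threshold` and their default values are hypothetical and are not given in this application.

```python
def adversarial_training_done(iteration, joint_loss,
                              max_iterations=10_000, loss_threshold=0.05):
    """Return True when either assumed stopping criterion is met.

    iteration      -- number of adversarial iterations completed so far
    joint_loss     -- current joint loss of recognizer and domain classifier
    max_iterations -- hypothetical iteration budget
    loss_threshold -- hypothetical joint-loss target
    """
    return iteration >= max_iterations or joint_loss <= loss_threshold
```

A training loop would call this predicate after each update of the recognition network and the domain classifier, and keep only the recognition network once it returns True.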
[0015] Based on the reverse effect of the classification results, determine the corresponding parameter adjustment strategy;
[0016] According to the determined parameter adjustment strategy, the second parameter of the initial recognition network model is adjusted within the corresponding parameter adjustment period to obtain the updated initial recognition network model.
[0017] In one embodiment, adjusting the second parameter of the initial recognition network model according to the determined parameter adjustment strategy within a corresponding parameter adjustment period to obtain an updated initial recognition network model includes:
[0018] Obtain the current iteration number in the current training process, and determine the corresponding parameter adjustment period based on the current iteration number and the preset total iteration number;
[0019] During the parameter adjustment period, the second parameter of the initial recognition network model is adjusted according to the determined parameter adjustment strategy to obtain the updated initial recognition network model.
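Paragraphs [0018] and [0019] tie the parameter adjustment period to the current and total iteration counts but do not give a formula. One common choice in domain-adversarial training, used here purely as an assumption and not as the method of this application, is to ramp the strength of the reversed gradient from 0 toward 1 as training progresses. The function name and the `gamma` value are illustrative.

```python
import math

def reversal_strength(current_iteration, total_iterations, gamma=10.0):
    """Ramp from 0 toward 1 as training progresses (assumed schedule)."""
    if total_iterations <= 0:
        raise ValueError("total_iterations must be positive")
    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(current_iteration / total_iterations, 0.0), 1.0)
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0
```

With such a schedule, early iterations barely disturb the recognition network, and the reversed gradient takes full effect only once the face features are reasonably stable.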
[0020] In one embodiment, the classification result identified by the initial domain classifier includes a first category label and a second category label, and the facial features include first category facial features corresponding to the first category label and second category facial features corresponding to the second category label; the step of determining the corresponding parameter adjustment strategy based on the reverse action of the classification result includes:
[0021] Based on the reverse effect of the classification results, the common distribution information between the first category face features corresponding to the first category label and the second category face features corresponding to the second category label is determined;
[0022] Based on the common distribution information, the corresponding parameter adjustment strategy is determined.
[0023] In one embodiment, the initial recognition network model is trained in the following ways:
[0024] The original recognition network model is used to identify the facial feature map corresponding to each training sample in the training sample set, and the facial features corresponding to the facial feature map are extracted.
[0025] Based on the facial features and the label information of the training samples in the training sample set that correspond to the facial feature map, the corresponding facial recognition objective function value is determined.
[0026] The original recognition network model is updated using gradient descent until the face recognition objective function value or the second iteration number of iterative training meets the second training stopping condition, thus obtaining the initial recognition network model after initial training.
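The initial training in [0024] to [0026] can be illustrated with a toy stand-in. This is a minimal sketch, assuming the face features are already extracted as small vectors, the recognition objective is softmax cross-entropy over identity labels, and the stopping values are hypothetical; a real recognition network would be a deep model rather than a single linear layer.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_identity_classifier(samples, labels, num_ids, lr=0.5,
                              max_iterations=500, loss_threshold=0.05):
    """Fit a linear softmax classifier by batch gradient descent.

    Stops when the mean cross-entropy falls below loss_threshold or when
    max_iterations is reached (both values are assumptions).
    Returns (weights, last evaluated loss, iterations run).
    """
    dim = len(samples[0])
    weights = [[0.0] * dim for _ in range(num_ids)]
    loss, iteration = float("inf"), 0
    for iteration in range(1, max_iterations + 1):
        grads = [[0.0] * dim for _ in range(num_ids)]
        loss = 0.0
        for x, y in zip(samples, labels):
            logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
            probs = softmax(logits)
            loss -= math.log(probs[y])
            for k in range(num_ids):
                err = probs[k] - (1.0 if k == y else 0.0)
                for j in range(dim):
                    grads[k][j] += err * x[j]
        loss /= len(samples)
        if loss < loss_threshold:  # objective-value stopping condition
            break
        for k in range(num_ids):
            for j in range(dim):
                weights[k][j] -= lr * grads[k][j] / len(samples)
    return weights, loss, iteration
```

The loop exits on whichever condition is met first, mirroring the "objective function value or iteration number" wording of [0026].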
[0027] In one embodiment, the initial domain classifier is trained in the following ways:
[0028] Obtain the facial features output by the initial recognition network model;
[0029] The facial features are identified using the original classifier, and category labels corresponding to each facial feature are output.
[0030] Based on the identified category labels and the pre-labeled category labels of the training samples in the training sample set corresponding to the facial features, the corresponding loss function value is determined.
[0031] The original classifier is updated using gradient descent until the loss function value or the third iteration count of the training meets the third training stopping condition, thus obtaining the initial domain classifier after initial training.
[0032] Secondly, this application also provides an apparatus for generating a face recognizer. The apparatus includes:
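The domain classifier training in [0028] to [0031] can be sketched in the same spirit. The sketch below assumes a linear classifier with a sigmoid output, binary cross-entropy as the loss function, and domain labels 0 (first category) and 1 (second category); none of these choices, nor the stopping values, are specified in this application.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def domain_loss_and_grad(features, domain_labels, weights, bias):
    """Mean binary cross-entropy and its gradient for a linear domain classifier."""
    n = len(features)
    loss, grad_w, grad_b = 0.0, [0.0] * len(weights), 0.0
    for f, y in zip(features, domain_labels):
        p = sigmoid(sum(w * fi for w, fi in zip(weights, f)) + bias)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for j, fj in enumerate(f):
            grad_w[j] += (p - y) * fj / n
        grad_b += (p - y) / n
    return loss / n, grad_w, grad_b

def train_domain_classifier(features, domain_labels, lr=0.5,
                            max_iterations=200, loss_threshold=0.1):
    """Gradient descent until the loss or the iteration count meets its limit."""
    weights, bias = [0.0] * len(features[0]), 0.0
    loss = float("inf")
    for _ in range(max_iterations):
        loss, grad_w, grad_b = domain_loss_and_grad(
            features, domain_labels, weights, bias)
        if loss < loss_threshold:
            break
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias, loss
```

The input `features` stands for the face features output by the initial recognition network model, which is held fixed during this initial training stage.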
[0033] The face feature recognition module is used to perform face feature recognition on training samples in the training sample set using the initial recognition network model obtained from the initial training, and to obtain the corresponding face features; the training samples include face images of the first category and face images of the second category;
[0034] The classification and recognition module is used to classify and recognize the facial features using the initial domain classifier obtained from the initial training, and generate the corresponding classification results.
[0035] The face recognizer generation module is used to perform adversarial training on the initial recognition network model and the initial domain classifier based on the face features and the classification results, and use the initial recognition network model at the end of training as the trained face recognizer.
[0036] Thirdly, this application also provides a computer device. The computer device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program to perform the following steps:
[0037] The initial recognition network model obtained from the initial training is used to perform facial feature recognition on the training samples in the training sample set to obtain the corresponding facial features; the training samples include face images of the first category and face images of the second category;
[0038] The facial features are classified and identified using the initial domain classifier obtained from the initial training, and the corresponding classification results are generated.
[0039] Based on the facial features and the classification results, the initial recognition network model and the initial domain classifier are subjected to adversarial training, and the initial recognition network model at the end of training is used as the trained face recognizer.
[0040] Fourthly, this application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program thereon, which, when executed by a processor, performs the following steps:
[0041] The initial recognition network model obtained from the initial training is used to perform facial feature recognition on the training samples in the training sample set to obtain the corresponding facial features; the training samples include face images of the first category and face images of the second category;
[0042] The facial features are classified and identified using the initial domain classifier obtained from the initial training, and the corresponding classification results are generated.
[0043] Based on the facial features and the classification results, the initial recognition network model and the initial domain classifier are subjected to adversarial training, and the initial recognition network model at the end of training is used as the trained face recognizer.
[0044] Fifthly, this application also provides a computer program product. The computer program product includes a computer program that, when executed by a processor, performs the following steps:
[0045] The initial recognition network model obtained from the initial training is used to perform facial feature recognition on the training samples in the training sample set to obtain the corresponding facial features; the training samples include face images of the first category and face images of the second category;
[0046] The facial features are classified and identified using the initial domain classifier obtained from the initial training, and the corresponding classification results are generated.
[0047] Based on the facial features and the classification results, the initial recognition network model and the initial domain classifier are subjected to adversarial training, and the initial recognition network model at the end of training is used as the trained face recognizer.
[0048] In the aforementioned method, apparatus, computer device, and storage medium for generating a face recognizer, an initial recognition network model obtained through initial training is used to perform face feature recognition on training samples in the training sample set to obtain corresponding face features. Then, an initial domain classifier obtained through initial training is used to classify and recognize the face features, generating corresponding classification results. Furthermore, based on the face features and classification results, adversarial training is performed on the initial recognition network model and the initial domain classifier, and the initial recognition network model at the end of training is used as the trained face recognizer. Because the recognition network model and domain classifier undergo adversarial training, the trained face recognizer can learn the common distribution among different categories of face images during the adversarial learning process. Therefore, based on face images with common distributions, data augmentation can be performed for face image categories where training sample data is scarce, and the augmented training sample data is then used to train a face recognizer for the application scenario. Consequently, in different practical application scenarios, accurate face recognition can be performed on the face images to be recognized, determining the person to whom these face images belong and thereby improving the accuracy of face recognition.
Attached Figure Description
[0049] Figure 1 is a diagram illustrating the application environment of a method for generating a face recognizer in one embodiment;
[0050] Figure 2 is a flowchart illustrating a method for generating a face recognizer in one embodiment;
[0051] Figure 3 is a flowchart illustrating the process of obtaining a trained face recognizer in one embodiment;
[0052] Figure 4 is a schematic diagram illustrating the process of training an initial recognition network model in one embodiment;
[0053] Figure 5 is a schematic diagram illustrating the process of training an initial domain classifier in one embodiment;
[0054] Figure 6 is a schematic diagram of the overall process of a method for generating a face recognizer in one embodiment;
[0055] Figure 7 is a structural block diagram of an apparatus for generating a face recognizer in one embodiment;
[0056] Figure 8 is an internal structural diagram of a computer device in one embodiment.
Detailed Implementation
[0057] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0058] The face recognition generator generation method provided in this application relates to artificial intelligence (AI) technology. AI is the theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, AI is a comprehensive technology within computer science that attempts to understand the essence of intelligence and produce a new type of intelligent machine that can react in a way similar to human intelligence. AI studies the design principles and implementation methods of various intelligent machines, enabling them to possess perception, reasoning, and decision-making capabilities.
[0059] Artificial intelligence (AI) is a comprehensive discipline encompassing a wide range of fields, including both hardware and software technologies. Fundamental AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operating / interactive systems, and mechatronics. AI software technologies primarily include computer vision, speech processing, natural language processing, and machine learning / deep learning. Computer vision (CV) is the science that studies how to enable machines to "see." More specifically, it refers to machine vision, which uses cameras and computers to replace human eyes in recognizing and measuring targets, and further processes images to create images more suitable for human observation or transmission to instruments. As a scientific discipline, computer vision studies related theories and technologies, attempting to build AI systems capable of extracting information from images or multidimensional data. Computer vision technology typically includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content / behavior recognition, 3D object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, as well as common biometric recognition technologies such as face recognition and fingerprint recognition.
[0060] With the research and advancement of artificial intelligence (AI) technology, AI is being studied and applied in various fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, autonomous driving, drones, robots, smart healthcare, and smart customer service. It is believed that with the development of technology, AI will be applied in more fields and play an increasingly important role.
[0061] The method for generating a face recognizer provided in this application embodiment involves artificial intelligence technologies such as image recognition, and can be applied in the application environment shown in Figure 1. In this environment, terminal 102 communicates with server 104 via a network. A data storage system can store the data that server 104 needs to process; it can be integrated into server 104 or located in the cloud or on another network server. Server 104 uses an initial recognition network model obtained from initial training to perform facial feature recognition on training samples in the training sample set, obtaining the corresponding facial features. The training samples include first-category and second-category face images. The training sample set can be stored locally on terminal 102 or in the data storage system corresponding to server 104. Server 104 uses an initial domain classifier obtained from initial training to classify and recognize the facial features, generating corresponding classification results. Then, based on the facial features and classification results, it performs adversarial training on the initial recognition network model and the initial domain classifier, and uses the initial recognition network model at the end of training as the trained face recognizer. Terminal 102 can be, but is not limited to, a personal computer, laptop, smartphone, tablet, IoT device, or portable wearable device. IoT devices can include smart voice interaction devices, smart speakers, smart TVs, smart air conditioners, smart vehicle devices, aircraft, and the like. Portable wearable devices can include smartwatches, smart bracelets, head-mounted devices, and the like. Server 104 can be implemented as a standalone server or as a server cluster consisting of multiple servers.
This invention can be applied to various scenarios, including but not limited to cloud technology, artificial intelligence, smart transportation, and assisted driving.
[0062] In one embodiment, as shown in Figure 2, a method for generating a face recognizer is provided. Taking the application of the method to the server in Figure 1 as an example, the method includes the following steps:
[0063] Step S202: Using the initial recognition network model obtained from the initial training, perform facial feature recognition on the training samples in the training sample set to obtain the corresponding facial features.
[0064] Specifically, the original recognition network model is initially trained using the training sample set to obtain the initial recognition network model. Then, the initial recognition network model is used to perform facial feature recognition on each training sample in the training sample set to obtain the corresponding facial features.
[0065] The training samples include first-category face images and second-category face images. The first-category face images can be real face images, while the second-category face images can be cartoon face images. Since the amount of cartoon face image data is relatively small, this embodiment uses both real and cartoon face images as training samples. Through adversarial training, common distribution information between cartoon and real face images is obtained. Based on this common distribution information, the training data for cartoon face images can be expanded using the large amount of real face image training data. Furthermore, by aligning the feature space of cartoon face images with that of real face images, the classification ability for real face images in the feature space can be used to improve the classification ability for cartoon face images in the feature space.
[0066] In one embodiment, the initial recognition network model is used to perform facial feature recognition on the training samples to obtain the corresponding facial features. The face recognition objective function value is then calculated based on the obtained facial features and the label information of the corresponding training samples, and the user identity to which the training sample belongs is determined based on the magnitude of the face recognition objective function value.
[0067] Specifically, the label information of a training sample can be understood as the ID information of that training sample. Based on the identity information of the person represented by the ID information and the identified facial features, the face recognition objective function value between the two is determined, and it is judged whether the difference between the face recognition objective function value and the preset recognition function threshold is less than the preset difference.
[0068] Specifically, when the difference between the face recognition objective function value and the preset recognition function threshold is less than the preset difference, the user identity determined by the current face feature recognition is consistent with the pre-labeled person identity information, and the current face recognition result is correct.
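The check described in [0067] and [0068] reduces to a single comparison. The sketch below assumes that "difference" means the absolute difference; the function and parameter names are illustrative only.

```python
def identity_matches(objective_value, recognition_threshold, preset_difference):
    """True when the objective value lies within preset_difference of the threshold.

    Assumes the absolute difference is intended; the source does not say.
    """
    return abs(objective_value - recognition_threshold) < preset_difference
```

For example, with a threshold of 0.5 and a preset difference of 0.05, an objective value of 0.52 would count as a correct recognition and a value of 0.9 would not.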
[0069] Furthermore, when the initial recognition network model is used to recognize real face images, it can determine the real user identity to which the training sample belongs. When the initial recognition network model is used to recognize cartoon face images, it can determine the specific cartoon character to which the training sample belongs.
[0070] Step S204: Use the initial domain classifier obtained from the initial training to classify and recognize facial features, and generate the corresponding classification results.
[0071] Specifically, by utilizing the facial features output by the initial recognition network model, the original classifier is initially trained to obtain the initial domain classifier after initial training. Then, the initial domain classifier is used to classify and recognize the facial features output by the recognition network model to generate the corresponding classification results.
[0072] The classification results include a first category label and a second category label. The first category label corresponds to the first category of facial features, and the second category label corresponds to the second category of facial features. The first category of facial features is extracted from the first category of facial images used as training samples, and the second category of facial features is extracted from the second category of facial images used as training samples.
[0073] Furthermore, the first category of face images can be real face images, while the second category of face images can be cartoon face images. The two can also be interchanged. That is, the first and second categories only serve to distinguish between them and are not specifically limited. The purpose of this embodiment is to enable the initial recognition network model to learn the common distribution between different categories of face images, so as to expand the training dataset of image categories with a small number of images.
[0074] Step S206: Based on facial features and classification results, perform adversarial training on the initial recognition network model and the initial domain classifier, and use the initial recognition network model at the end of training as the trained face recognizer.
[0075] Specifically, the classification results are superimposed onto the initial recognition network model in a reverse manner to update the initial recognition network model. Based on the facial features output by the updated initial recognition network model, the first parameter of the initial domain classifier is adjusted. Then, when it is determined that the first iteration count of the adversarial training, or the joint loss function value of the initial recognition network model and the initial domain classifier, meets the first training termination condition, the initial recognition network model at the end of training is used as the trained face recognizer.
[0076] Furthermore, the classification results fed back by the initial domain classifier indicate whether the facial features belong to the first or second category label. The initial domain classifier is trained using gradient descent, which minimizes the loss function to improve its classification accuracy. When the feedback from the initial domain classifier is superimposed on the initial recognition network model, the gradient is used as the adjustment unit. That is, the second parameter of the initial recognition network model is adjusted based on the gradient, rather than the objective function. However, since the reverse gradient adjustment changes the face recognition objective function value of the initial recognition network model, it may degrade face recognition ability. Therefore, the face recognition objective function must be used as a constraint, so that the face recognition ability of the initial recognition network model is maintained as far as possible while the second parameter is adjusted.
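One way to write the constrained update described in [0076] is given below. The notation is an assumption introduced for illustration and does not appear in this application: $\theta_f$ denotes the second parameter (recognition network), $\theta_d$ the first parameter (domain classifier), $L_{\mathrm{id}}$ the face recognition objective, $L_{\mathrm{dom}}$ the domain classification loss, $\eta$ a learning rate, and $\lambda$ the weight of the reversed gradient.

```latex
\begin{aligned}
\theta_d &\leftarrow \theta_d - \eta \, \frac{\partial L_{\mathrm{dom}}}{\partial \theta_d} \\
\theta_f &\leftarrow \theta_f - \eta \left( \frac{\partial L_{\mathrm{id}}}{\partial \theta_f}
          - \lambda \, \frac{\partial L_{\mathrm{dom}}}{\partial \theta_f} \right)
\end{aligned}
```

The domain classifier descends its own loss, while the recognition network descends the recognition objective and ascends the domain loss. The $L_{\mathrm{id}}$ term plays the role of the constraint that preserves recognition ability.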
[0077] Similarly, when the updated initial recognition network model is obtained after adjusting the model parameters, the facial features extracted by the updated initial recognition network model are fed back to the initial domain classifier. The objective function of the initial domain classifier itself does not change, but the first parameter of the initial domain classifier will be adjusted. In other words, the initial recognition network model and the initial domain classifier in the adversarial training process are mutually competitive and influence each other.
[0078] The purpose of adversarial training on the initial recognition network model and the initial domain classifier is to enable the initial recognition network model to learn the common distribution among different categories of face images fed back by the initial domain classifier. By adjusting the model parameters of the initial recognition network model, an updated initial recognition network model is obtained. When the face features extracted by the updated initial recognition network model are fed back to the initial domain classifier for classification, the initial domain classifier is unable to accurately identify whether the face features belong to the first category label or the second category label.
[0079] Understandably, during training, the initial recognition network model aims to acquire the common distribution of face images from different categories, while the initial domain classifier aims to identify the face category to which the face features fed back by the initial recognition network model belong. Adversarial training lets the initial recognition network model and the initial domain classifier adjust and update each other. When the first iteration count of the adversarial training, or the joint loss function value of the initial recognition network model and the initial domain classifier, meets the first training termination condition, the two reach an adversarial equilibrium. At this point, the initial recognition network model at the end of training is used as the trained face recognizer. The initial domain classifier serves only as an auxiliary training tool and is not needed in subsequent model deployment; only the trained face recognizer needs to be exported.
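A simple way to monitor how close training is to the equilibrium described above is to watch the domain classifier's accuracy drift toward chance. This is only an illustrative diagnostic, assuming two roughly balanced categories, a 0.5 decision cutoff, and a hypothetical tolerance; this application does not prescribe such a check.

```python
def domain_accuracy(predicted_probs, domain_labels, cutoff=0.5):
    """Fraction of features whose predicted domain matches the label.

    A probability >= cutoff is read as category 1, otherwise category 0.
    """
    hits = sum(int(p >= cutoff) == y
               for p, y in zip(predicted_probs, domain_labels))
    return hits / len(domain_labels)

def near_chance(accuracy, tolerance=0.05):
    """True when a two-class domain classifier is close to guessing (50%)."""
    return abs(accuracy - 0.5) <= tolerance
```

When `near_chance` holds, the features of the two categories are hard to tell apart, which is the behavior the adversarial training aims for.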
[0080] In one embodiment, after obtaining the trained face recognizer, the method further includes: exporting the trained face recognizer and deploying the trained face recognizer to the face recognition system.
[0081] Specifically, since the initial domain classifier is used as an auxiliary training tool to achieve adversarial training between the initial recognition network model and the initial domain classifier, it is not necessary to use it when deploying the model later. It is only necessary to export the trained face recognizer and deploy the trained face recognizer to the face recognition system.
[0082] The face recognition system can be a real face recognition system or a cartoon face recognition system. In this embodiment, since the cartoon face data is relatively small, real face images are needed to expand the training data when training the face recognizer. The expanded training data can then be used to train the face recognizer, which can then be applied to the cartoon face recognition system to identify specific cartoon characters and improve the accuracy of cartoon character recognition.
[0083] In the aforementioned face recognition generator generation method, an initial recognition network model obtained through initial training is used to perform face feature recognition on training samples in the training sample set to obtain corresponding face features. Then, an initial domain classifier obtained through initial training is used to classify and recognize the face features, generating corresponding classification results. Furthermore, based on the face features and classification results, adversarial training can be performed on the initial recognition network model and the initial domain classifier, and the initial recognition network model at the end of training is used as the trained face recognition generator. Because the initial recognition network model and the initial domain classifier undergo adversarial training, the trained face recognition generator can learn the common distributions among different categories of face images during the adversarial learning process. Therefore, based on face images with common distributions, data augmentation can be performed on face category images where training sample data is scarce. This augmented training sample data is then used to train a face recognition generator for application scenarios. Consequently, in different practical application scenarios, accurate face recognition can be performed on the face images to be recognized in those scenarios, determining the person to whom these face images belong, thereby improving the accuracy of face recognition.
[0084] In one embodiment, as shown in Figure 3, the step of obtaining a trained face recognizer, namely performing adversarial training on the initial recognition network model and the initial domain classifier based on the face features and classification results and using the initial recognition network model at the end of training as the trained face recognizer, specifically includes:
[0085] Step S302: The classification results are superimposed on the initial recognition network model with a reverse action to update the initial recognition network model.
[0086] Specifically, the classification results fed back by the initial domain classifier are obtained, and the corresponding parameter adjustment strategy is determined from that feedback. The second parameter of the initial recognition network model is then adjusted within the corresponding parameter adjustment period according to the determined strategy, yielding the updated recognition network model.
[0087] The training objective of the initial domain classifier is to identify the face category to which the face features output by the initial recognition network model belong. Its classification result indicates whether a face feature belongs to the first or the second category label. The initial domain classifier is trained by gradient descent, which continuously reduces its loss function to improve classification accuracy. Optimizers such as stochastic gradient descent (SGD), adaptive gradient (Adagrad), batch gradient descent (BGD), adaptive moment estimation (Adam), and stochastic gradient descent with a momentum term can be used to train and optimize the initial domain classifier.
[0088] Furthermore, since the initial domain classifier needs to be optimized and trained using gradient descent to make the category labels of the identified facial features more accurate, the feedback effect of the classification result is reflected in the inverse gradient adjustment. That is, the parameter adjustment strategy is to use gradient ascent, using the gradient as the adjustment unit. When adjusting the second parameter of the initial recognition network model, gradient ascent is used so that the initial recognition network model after parameter adjustment can learn the common distribution between different categories of facial images.
[0089] Step S304: Adjust the first parameter of the initial domain classifier based on the face features output by the updated initial recognition network model.
[0090] Specifically, after the initial recognition network model is updated, the updated facial features it outputs are obtained and input into the initial domain classifier. The initial domain classifier is then trained and updated with these features, that is, its first parameter is adjusted using the updated facial features.
[0091] The purpose of adversarial training is to achieve mutual adjustment and updating between the initial recognition network model and the initial domain classifier, so that the two eventually reach an adversarial equilibrium. This allows the initial recognition network model to learn the common distribution among different categories of face images fed back by the initial domain classifier. By adjusting the second parameter of the initial recognition network model, an updated initial recognition network model is obtained. When the face features extracted by the updated initial recognition network model are fed back to the initial domain classifier for classification, the initial domain classifier cannot accurately identify whether the face features belong to the first category label or the second category label.
[0092] In one embodiment, during the adversarial training process of the initial recognition network model and the initial domain classifier, the following formulas (1) and (2) are specifically used for adversarial training:
[0093] $R(X) = X$ (1);
[0094] $\frac{dR(X)}{dX} = -\gamma I$ (2);
[0095] The adversarial training layer serves two purposes. In the forward pass, its output equals its input, as represented by formula (1). In the backward pass, it reverses the gradient by applying a negative sign "-", as represented by formula (2). In formula (1), X represents the input data and R(X) represents the response of the adversarial training layer after the forward pass, so input and output are identical. In formula (2), which describes the reverse action of the adversarial training layer, the degree of gradient adjustment is obtained by taking the derivative of the response function; this can also be understood as the specific parameter adjustment strategy. I represents the gradient, and γ represents a function that changes with the number of iterations p; the gradient I is inverted and scaled by γ.
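The behaviour described for the adversarial training layer (identity in the forward pass, sign-flipped and γ-scaled gradient in the backward pass) can be illustrated with a minimal sketch. The class name `GradientReversal` and its method names are hypothetical and do not come from the application; features and gradients are plain Python lists for simplicity.

```python
class GradientReversal:
    """Identity in the forward pass; multiplies the gradient by -gamma in the backward pass."""

    def __init__(self, gamma):
        # gamma: value of the iteration-dependent function at the current step (assumed given).
        self.gamma = gamma

    def forward(self, x):
        # Forward pass: the response equals the input.
        return list(x)

    def backward(self, grad_output):
        # Backward pass: the incoming gradient is reversed and scaled by gamma.
        return [-self.gamma * g for g in grad_output]


layer = GradientReversal(gamma=0.5)
features = [0.2, -1.0, 3.0]
passed_through = layer.forward(features)        # unchanged copy of the features
reversed_grad = layer.backward([1.0, -2.0, 0.0])  # gradient handed back to the recognition model
```

With γ = 0 the backward pass returns zeros, so the recognition network model receives no adjustment from the classifier on that step.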
[0096] Furthermore, the function γ, which varies with the number of iterations p, is expressed by the following formulas (3) and (4):
[0097] (3);
[0098] $\gamma = 0, \quad p \bmod 3 \neq 0$ (4);
[0099] Where p represents the current iteration number and k is the total iteration number. When the remainder of p divided by 3 is equal to 0, the function γ is adjusted according to formula (3) to achieve the purpose of adjusting the gradient adjustment degree. In formula (4), when the remainder of p divided by 3 is not equal to 0, the value of the γ function is 0, and the gradient adjustment degree is also 0, so no adjustment is needed.
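The case split between formulas (3) and (4) can be sketched as follows. The expression of formula (3) is not reproduced in this text, so the function `ramp` below is only an assumed placeholder (a smooth ramp from 0 toward 1 that is often used with gradient-reversal training); it should not be read as the formula of the application. The names `ramp`, `gamma`, and `steepness` are hypothetical.

```python
import math


def ramp(p, k, steepness=10.0):
    """ASSUMED stand-in for formula (3): rises smoothly from 0 toward 1 as p approaches k."""
    return 2.0 / (1.0 + math.exp(-steepness * p / k)) - 1.0


def gamma(p, k, active_value=ramp):
    """Gradient adjustment degree at iteration p, given k total iterations."""
    if p % 3 == 0:
        return active_value(p, k)  # branch governed by formula (3)
    return 0.0                     # branch governed by formula (4): no adjustment


# Adjustment degree over the first six iterations of a 30-iteration run.
schedule = [gamma(p, 30) for p in range(1, 7)]
```

Only every third entry of `schedule` is non-zero, which matches the description that the recognition network model is adjusted once per three iterations.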
[0100] Specifically, during adversarial training, the parameter adjustment period is determined according to formulas (3) and (4). In practice, the initial domain classifier is trained for two steps, that is, its parameters are adjusted twice by gradient descent. When the classification result of the third step is obtained, a reversed gradient is passed to the initial recognition network model, that is, the second parameter of the initial recognition network model is adjusted by gradient ascent. Training ends when the first iteration number of adversarial training, or the joint loss function value of the initial recognition network model and the initial domain classifier, meets the first training termination condition, and the initial recognition network model at the end of training is used as the trained face recognizer.
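The two-steps-then-one schedule can be illustrated with a deliberately tiny model. This is a sketch under strong assumptions, not the networks of the application: the "recognition model" is a single scalar weight `w` (feature f = w·x), the "domain classifier" is a logistic unit with parameters `v` and `b`, and γ is held constant. Whether the classifier is also updated on the third step is not stated in the text; this sketch updates only the extractor there.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def adversarial_training(samples, k, lr=0.1, gamma_value=1.0):
    """Toy loop: extractor f = w * x, domain classifier t = sigmoid(v * f + b)."""
    w, v, b = 1.0, 0.5, 0.0   # w plays the "second parameter"; v, b the "first parameter"
    extractor_updates = []
    for p in range(1, k + 1):
        grad_w = grad_v = grad_b = 0.0
        for x, y in samples:          # y is the domain label (0 or 1)
            f = w * x
            t = sigmoid(v * f + b)
            err = t - y               # derivative of cross-entropy w.r.t. the logit
            grad_v += err * f
            grad_b += err
            grad_w += err * v * x
        if p % 3 == 0:
            # Third step: the reversed gradient reaches the extractor (gradient ascent).
            w += lr * gamma_value * grad_w
            extractor_updates.append(p)
        else:
            # Two classifier steps: ordinary gradient descent.
            v -= lr * grad_v
            b -= lr * grad_b
    return w, v, b, extractor_updates


samples = [(1.0, 1), (-1.0, 0)]   # one sample per domain
w, v, b, updated_at = adversarial_training(samples, k=9)
```

In this toy run the classifier weight grows while the extractor weight shrinks, so the features of the two domains move closer together, which is the intended effect of the reversed gradient.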
[0101] Step S306: When it is determined that the first iteration number of the adversarial training, or the joint loss function value of the initial recognition network model and the initial domain classifier, satisfies the first training termination condition, the initial recognition network model at the end of training is used as the trained face recognizer.
[0102] Specifically, training ends when the first iteration number of adversarial training satisfies the first training termination condition, or when the joint loss function value of the initial recognition network model and the initial domain classifier satisfies the first training termination condition, and the initial recognition network model at the end of training is used as the trained face recognizer.
[0103] The first iteration number satisfying the first training termination condition means that the first iteration number has reached the corresponding first number threshold. The joint loss function value of the initial recognition network model and the initial domain classifier satisfying the first training termination condition means that the joint loss function value has reached the corresponding first loss threshold.
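The first training termination condition can be written as a small predicate. The function and parameter names are hypothetical, and the direction of the loss comparison (at or below the threshold) is an assumption, since the text only says the loss has "reached" its threshold.

```python
def first_termination_met(iteration, joint_loss, max_iterations, loss_threshold):
    """True when either criterion of the adversarial phase is satisfied."""
    reached_iteration_cap = iteration >= max_iterations
    reached_loss_threshold = joint_loss <= loss_threshold  # assumed direction
    return reached_iteration_cap or reached_loss_threshold
```

A training loop would call this once per iteration and export the recognition network model as soon as it returns `True`.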
[0104] In this embodiment, the classification results are superimposed onto the initial recognition network model in a reverse manner to update the initial recognition network model. Based on the facial features output by the updated initial recognition network model, the first parameter of the initial domain classifier is adjusted. Then, when the first iteration number of adversarial training, or the joint loss function value of the initial recognition network model and the initial domain classifier, meets the first training termination condition, the initial recognition network model at the end of training is used as the trained face recognizer. This achieves adversarial training between the initial recognition network model and the initial domain classifier, together with the adjustment of their respective parameters, so that the finally trained face recognizer learns the common distribution among different categories of face images during the adversarial learning process. Based on the augmented training sample data, a face recognizer for the application scenario can be trained, enabling accurate face recognition of face images in different practical application scenarios.
[0105] In one embodiment, the step of adjusting the second parameter of the initial recognition network model within a corresponding parameter adjustment period according to a determined parameter adjustment strategy to obtain an updated initial recognition network model includes:
[0106] Obtain the current iteration number in the current training process, and determine the corresponding parameter adjustment period based on the current iteration number and the preset total iteration number; within the parameter adjustment period, adjust the second parameter of the initial recognition network model according to the determined parameter adjustment strategy to obtain the updated initial recognition network model.
[0107] Specifically, the current iteration number in the current training process and the preset total iteration number are obtained, and it is determined whether the remainder of the current iteration number divided by 3 is 0. When the remainder is 0, the corresponding parameter adjustment period is determined from the current iteration number and the preset total iteration number.
[0108] The parameter adjustment strategy is determined based on the reverse effect of the classification results. After determining the parameter adjustment period, the second parameter of the initial recognition network model is further adjusted within the corresponding parameter adjustment period according to the determined parameter adjustment strategy.
[0109] In this embodiment, the current iteration number in the current training process is obtained, and a corresponding parameter adjustment period is determined based on the current iteration number and the preset total iteration number. Then, within the parameter adjustment period, the second parameter of the initial recognition network model is adjusted according to the determined parameter adjustment strategy to obtain an updated initial recognition network model. This achieves the adjustment of the second parameter of the initial recognition network model according to the determined parameter adjustment strategy within the corresponding parameter adjustment period, so that the updated initial recognition network model can acquire the common distribution of face images of different categories. Based on the face images with the common distribution, data augmentation is performed, and a face recognizer for the application scenario is trained using the augmented training sample data. This enables accurate face recognition of face images to be recognized in different practical application scenarios.
[0110] In one embodiment, the classification result identified by the initial domain classifier includes a first category label and a second category label, and the facial features include first category facial features corresponding to the first category label and second category facial features corresponding to the second category label. The step of determining the corresponding parameter adjustment strategy based on the reverse effect of the classification result includes:
[0111] Based on the reverse effect of the classification results, the common distribution information between the first category face features corresponding to the first category label and the second category face features corresponding to the second category label is determined; based on the common distribution information, the corresponding parameter adjustment strategy is determined.
[0112] Specifically, based on the reverse effect of the classification results fed back by the initial domain classifier, feature fine-tuning of the second category face features is achieved based on the first category face features, and feature space alignment is performed between the first category face features and the second category face features to determine the common distribution information between the first category face features corresponding to the first category label and the second category face features corresponding to the second category label. Furthermore, based on the common distribution information, the corresponding parameter adjustment strategy is determined.
[0113] The parameter adjustment strategy can be understood as using an adversarial training layer to perform adversarial training on the initial recognition network model and the initial domain classifier. When the inverse effect of the classification result fed back by the initial domain classifier is superimposed on the initial recognition network model, the corresponding gradient adjustment degree is determined based on the common distribution information. The determined gradient adjustment degree is used as the parameter adjustment strategy, that is, the gradient is used as the adjustment unit to perform inverse gradient adjustment on the second parameter of the initial recognition network model, so as to obtain the updated initial recognition network model.
[0114] In this embodiment, based on the reverse effect of the classification results, the common distribution information between the first category face features corresponding to the first category label and the second category face features corresponding to the second category label is determined. Then, based on the common distribution information, a corresponding parameter adjustment strategy is determined. This achieves the determination of parameter adjustment strategies based on common distribution information, allowing for parameter adjustment of the initial recognition network model. The updated initial recognition network model can learn the common distribution between face images of different categories, enabling data augmentation of face category images with scarce training sample data based on face images with common distributions. This allows for training a face recognizer for the application scenario using the augmented training sample data, thereby improving the face recognition accuracy of the face recognizer.
[0115] In one embodiment, as shown in Figure 4, the steps for training the initial recognition network model specifically include:
[0116] Step S402: Identify the face feature map corresponding to each training sample in the training sample set through the original recognition network model, and extract the face features corresponding to the face feature map.
[0117] Specifically, the first category face images and the second category face images are used as the same batch of training samples, that is, multiple first category and second category face images together form the training sample set. The training sample set is input into the original recognition network model, which identifies the face feature map corresponding to each training sample and extracts the face features corresponding to the identified face feature maps.
[0118] The first category of face images can be real face images, while the second category of face images can be cartoon face images. The two categories can also be interchanged. That is, the first and second categories only serve to distinguish between them and do not impose specific restrictions.
[0119] More specifically, spatial feature extraction is performed on the training samples to obtain the corresponding facial feature maps, which retain the spatial structure information of the training samples. A convolutional neural network (CNN) applies convolution, ReLU activation, and pooling operations to the training samples to extract the facial feature maps. These facial feature maps are then input into a fully connected mapping network, which outputs the corresponding facial features; the facial features can be represented as vectors.
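The convolution, ReLU, pooling, and fully connected mapping chain can be sketched on a tiny single-channel image. This is an illustration only: the kernel and the fully connected weights are arbitrary untrained values, all function names are hypothetical, and a real recognition network would use many channels and layers.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(fmap):
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling (trailing odd row/column is dropped)."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]


def fully_connected(fmap, weights):
    """Flatten the feature map and map it to a feature vector."""
    flat = [v for row in fmap for v in row]
    return [sum(w * x for w, x in zip(row, flat)) for row in weights]


image = [[0, 0, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 0, 0],
         [0, 0, 1, 0, 0]]
kernel = [[1, 0], [0, -1]]                                  # illustrative, untrained
feature_map = max_pool2x2(relu(conv2d(image, kernel)))      # 5x5 -> 4x4 -> 2x2
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]      # illustrative, untrained
face_feature = fully_connected(feature_map, weights)        # 2-dimensional feature vector
```

The feature map keeps the 2-D layout of the input, while the fully connected step produces the vector that the later classification stages consume.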
[0120] Step S404: Based on the facial features and the label information of the training samples in the training sample set corresponding to the facial feature map, determine the corresponding facial recognition objective function value.
[0121] Specifically, the facial features output by the fully connected mapping network and the label information of the facial images that generated the facial features are obtained, that is, the label information of the training samples corresponding to the facial feature maps in the training sample set. Based on the facial features and label information as input data, the facial recognition objective function value of the selected facial recognition objective function is calculated.
[0122] The objective function for face recognition can be a classification function, such as the softmax function, various softmax functions with added margins, or other types of objective functions.
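As one example of a softmax objective with an added margin, the sketch below compares plain softmax cross-entropy with an additive cosine margin variant. The choice of this particular margin form, the `margin` and `scale` values, and the function names are assumptions for illustration; the application does not specify which margin function is used.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def additive_margin_logits(cosines, label, margin=0.35, scale=30.0):
    """Subtract a margin from the target-class cosine, then rescale all logits."""
    return [scale * (c - margin if j == label else c)
            for j, c in enumerate(cosines)]


# Cosine similarities between one face feature and three identity centres.
cosines = [0.8, 0.3, 0.1]
plain_loss = softmax_cross_entropy([30.0 * c for c in cosines], label=0)
margin_loss = softmax_cross_entropy(additive_margin_logits(cosines, label=0), label=0)
```

The margin makes the same sample incur a larger loss, which pushes features of the same identity closer together during training.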
[0123] Step S406: The original recognition network model is updated using gradient descent until the face recognition objective function value or the second iteration number of iterative training meets the second training stopping condition, thus obtaining the initial recognition network model after initial training.
[0124] Specifically, the original recognition network model is trained by gradient descent, which continuously reduces its loss function and thereby improves its recognition accuracy. For example, stochastic gradient descent (SGD), adaptive gradient (Adagrad), batch gradient descent (BGD), adaptive moment estimation (Adam), or stochastic gradient descent with a momentum term can be used to train and optimize the original recognition network model, thereby updating it and obtaining the initial recognition network model after initial training.
[0125] Furthermore, when the face recognition objective function value satisfies the second training stopping condition, or when the second iteration number of iterative training satisfies the second training stopping condition, training ends, and the initial recognition network model after initial training is obtained.
[0126] Among them, the face recognition objective function value satisfies the second training stopping condition, which means that the loss value of the face recognition objective function is less than the corresponding second loss threshold, and the second iteration number of iterative training satisfies the second training stopping condition, which means that the second iteration number has reached the corresponding second number threshold.
[0127] In this embodiment, the original recognition network model identifies the facial feature map corresponding to each training sample in the training sample set and extracts the facial features corresponding to the facial feature maps. Then, based on the facial features and the label information of the corresponding training samples, the facial recognition objective function value is determined. Gradient descent is used to update the original recognition network model until the facial recognition objective function value or the second iteration number of the iterative training meets the second training stopping condition, yielding the initial recognition network model after initial training. The original recognition network model is thus iteratively optimized until the second training stopping condition is reached, resulting in a well-trained initial recognition network model. This reduces the recognition error rate of the original recognition network model, which reduces unnecessary error factors during subsequent adversarial training and further improves the facial recognition accuracy of the finally generated facial recognizer.
[0128] In one embodiment, as shown in Figure 5, the steps for training the initial domain classifier specifically include:
[0129] Step S502: Obtain the facial features output by the initial recognition network model.
[0130] Specifically, by inputting each training sample from the training sample set into the initial recognition network model and obtaining the facial features output by the initial recognition network model, the facial features are used as input data for the original classifier, and the original classifier is initially trained using the facial features.
[0131] In one approach, neural architecture search (NAS) is used to optimize where the domain classifier attaches: it determines the access position between the initial domain classifier and the initial recognition network model, that is, it finds the optimal access point and establishes the connection between the two.
[0132] Step S504: Identify facial features using the original classifier and output the category label corresponding to each facial feature.
[0133] Specifically, the facial features are input into the original classifier, which recognizes them and outputs a category label for each facial feature. The category label is either a first category label or a second category label. From this output, it can be determined whether a facial feature belongs to the first or the second face category.
[0134] For example, face images can include cartoon face images and real face images. Based on the output, the category to which the face features belong can be determined, that is, whether the training sample is a real face image or a cartoon face image.
[0135] More specifically, a classification network built from convolutional layers, that is, the original classifier, performs operations such as convolution, ReLU nonlinear activation, and pooling on the facial features to determine the category to which they belong. The original classifier can also be a fully connected network, and its structure can be adjusted according to its input. The input to this classification network can be the facial features output by an intermediate layer of the initial recognition network model or the facial features output by its final layer.
[0136] Step S506: Based on the identified category labels and the pre-labeled category labels of the training samples corresponding to the facial features in the training sample set, determine the corresponding loss function value.
[0137] Specifically, the identified category labels and the pre-labeled category labels of the training samples corresponding to the facial features in the training sample set are used as input data to calculate the loss function value of the cross-entropy objective function.
[0138] Furthermore, the cross-entropy objective function is a binary cross-entropy, and the loss function value is expressed by the following formula (5):
[0139] $L = -\sum_{i} \left[ y_i \log t_i + (1 - y_i) \log (1 - t_i) \right]$ (5);
[0140] Where L represents the loss value of the cross-entropy objective function, y represents the class label, t represents the output probability of the classification network, and i is the sample index.
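A binary cross-entropy consistent with the symbols defined above (class label y, output probability t, sample index i) can be computed as in the sketch below. Summing over samples rather than averaging is an assumption, since the text does not say whether the loss is normalized; the clamp `eps` is an implementation detail added here to avoid taking the logarithm of zero.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Sum over samples of -[y*log(t) + (1-y)*log(1-t)]."""
    total = 0.0
    for y, t in zip(labels, probs):
        t = min(max(t, eps), 1.0 - eps)  # keep t strictly inside (0, 1)
        total -= y * math.log(t) + (1 - y) * math.log(1.0 - t)
    return total


# Domain labels: 1 for one face category, 0 for the other (which is which is arbitrary).
confident = binary_cross_entropy([1, 0], [0.9, 0.1])
uncertain = binary_cross_entropy([1, 0], [0.6, 0.4])
```

The more confidently correct the predicted probabilities are, the smaller the loss, so `confident` is lower than `uncertain`.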
[0141] Step S508: The original classifier is updated using gradient descent until the loss function value or the third iteration number of the iterative training meets the third training stopping condition, thus obtaining the initial domain classifier after initial training.
[0142] Specifically, the original classifier is trained by gradient descent, which continuously reduces its loss function to improve classification accuracy. For example, stochastic gradient descent (SGD), adaptive gradient (Adagrad), batch gradient descent (BGD), adaptive moment estimation (Adam), or stochastic gradient descent with a momentum term can be used to train and optimize the original classifier.
[0143] Furthermore, training ends when the loss function value of the cross-entropy objective function satisfies the third training stopping condition, or when the third iteration number of iterative training satisfies the third training stopping condition, and the initial domain classifier after initial training is obtained.
[0144] Among them, the loss function value of the cross-entropy objective function satisfies the third training stopping condition, which means that the loss function value of the cross-entropy objective function is less than the corresponding third loss threshold. The third iteration number of iterative training satisfies the third training stopping condition, which means that the third iteration number has reached the corresponding third number threshold.
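Steps S506 and S508 can be sketched as a training loop for a minimal classifier with both stopping criteria. The classifier here is a single logistic unit over feature vectors, which is a simplifying assumption (the text allows convolutional or fully connected networks); the function name, learning rate, thresholds, and sample data are all hypothetical.

```python
import math


def train_domain_classifier(features, labels, lr=0.5,
                            loss_threshold=0.1, max_iterations=1000):
    """Batch gradient descent on binary cross-entropy with two stopping criteria."""
    dim = len(features[0])
    weights, bias = [0.0] * dim, 0.0
    loss = float("inf")
    for iteration in range(1, max_iterations + 1):
        grad_w, grad_b, loss = [0.0] * dim, 0.0, 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            t = 1.0 / (1.0 + math.exp(-z))                 # output probability
            loss -= y * math.log(t) + (1 - y) * math.log(1.0 - t)
            for j, xi in enumerate(x):
                grad_w[j] += (t - y) * xi
            grad_b += t - y
        if loss < loss_threshold:                          # loss-based stopping criterion
            return weights, bias, loss, iteration
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias, loss, max_iterations             # iteration-count criterion


# Illustrative 2-D "face features": label 1 for one category, 0 for the other.
features = [[1.0, 0.2], [0.9, 0.1], [-1.0, -0.3], [-0.8, 0.0]]
labels = [1, 1, 0, 0]
weights, bias, final_loss, iterations_used = train_domain_classifier(features, labels)
```

On this separable toy data the loss threshold is reached well before the iteration cap; on harder data the cap would end training instead.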
[0145] In this embodiment, the facial features output by the initial recognition network model are obtained, and the original classifier recognizes these facial features and outputs the category label corresponding to each of them. Then, based on the recognized category labels and the pre-labeled category labels of the corresponding training samples in the training sample set, the loss function value is determined. The original classifier is updated using gradient descent until the loss function value or the third iteration number of training satisfies the third training stopping condition, yielding the initial domain classifier after initial training. The original classifier is thus iteratively optimized until the third training stopping condition is met, resulting in a well-trained initial domain classifier. This reduces the recognition error rate of the original classifier and minimizes unnecessary error factors during subsequent adversarial training, thereby further improving the facial recognition accuracy of the finally generated facial recognizer.
[0146] In one embodiment, as shown in Figure 6, a method for generating a face recognizer is provided, which specifically includes the following steps:
[0147] Step 1) Identify the face feature map corresponding to each training sample in the training sample set through the original recognition network model, and extract the face features corresponding to the face feature map.
[0148] Step 2) Based on facial features and the label information of training samples in the training sample set that correspond to the facial feature map, determine the corresponding facial recognition objective function value.
[0149] Step 3) Update the original recognition network model using gradient descent and determine whether the face recognition objective function value or the second iteration number of iterative training meets the second training stopping condition.
[0150] Step 4) When the second training stopping condition is met, the initial recognition network model after initial training is obtained.
[0151] Step 5) Obtain the facial features output by the initial recognition network model, and use the original classifier to recognize the facial features, outputting the category label corresponding to each facial feature.
[0152] Step 6) Based on the identified category labels and the pre-labeled category labels of the training samples corresponding to the facial features in the training sample set, determine the corresponding loss function value.
[0153] Step 7) Update the original classifier using gradient descent and determine whether the loss function value or the third iteration number of the iterative training meets the third training stopping condition.
[0154] Step 8) When the third training stopping condition is met, the initial domain classifier after initial training is obtained.
[0155] Step 9) Using the initial recognition network model obtained from the initial training, perform facial feature recognition on the training samples in the training sample set to obtain the corresponding facial features.
[0156] Step 10) Use the initial domain classifier obtained from the initial training to classify and recognize facial features and generate corresponding classification results.
[0157] Step 11) Based on the reverse effect of the classification results, determine the common distribution information between the first category face features corresponding to the first category label and the second category face features corresponding to the second category label.
[0158] Step 12) Determine the corresponding parameter adjustment strategy based on the common distribution information.
[0159] Step 13) Obtain the current iteration number in the current training process, and determine the corresponding parameter adjustment period based on the current iteration number and the preset total iteration number.
[0160] Step 14) During the parameter adjustment period, the second parameter of the initial recognition network model is adjusted according to the determined parameter adjustment strategy to obtain the updated initial recognition network model.
[0161] Step 15) Adjust the first parameter of the initial domain classifier based on the face features output by the updated initial recognition network model.
[0162] Step 16) Determine whether the first iteration number of the adversarial training or the joint loss function value of the initial recognition network model and the initial domain classifier satisfies the first training termination condition of the adversarial training.
[0163] Step 17) When the first training termination condition is met, the initial recognition network model at the end of training is used as the trained face recognizer.
[0164] Step 18) Export the trained face recognizer and deploy the trained face recognizer to the face recognition system.
[0165] In this embodiment, an initial recognition network model obtained through initial training is used to perform facial feature recognition on training samples in the training sample set to obtain corresponding facial features. An initial domain classifier obtained through initial training is then used to classify and recognize these facial features, generating corresponding classification results. Furthermore, adversarial training can be performed on the initial recognition network model and the initial domain classifier based on the facial features and classification results. The initial recognition network model at the end of training is then used as the trained facial recognizer. Because the recognition network model and domain classifier undergo adversarial training, the trained facial recognizer can learn the common distributions among different categories of facial images during the adversarial learning process. This allows for data augmentation of facial image categories with scarce training sample data based on these images with common distributions. The augmented training sample data is then used to train a facial recognizer for the application scenario, enabling accurate facial recognition of images in different practical application scenarios and determining the individuals associated with these images, thereby improving the accuracy of facial recognition.
[0166] It should be understood that although the steps in the flowcharts of the above embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the above embodiments may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.
[0167] Based on the same inventive concept, this application also provides a face recognizer generation apparatus for implementing the face recognizer generation method described above. The apparatus solves the problem in a manner similar to the method described above; therefore, for the specific limitations in the one or more face recognizer generation apparatus embodiments provided below, reference may be made to the limitations of the face recognizer generation method described above, which are not repeated here.
[0168] In one embodiment, as shown in Figure 7, a face recognizer generation apparatus is provided, including a face feature recognition module 702, a classification and recognition module 704, and a face recognizer generation module 706, wherein:
[0169] The face feature recognition module 702 is used to perform face feature recognition on the training samples in the training sample set using the initial recognition network model obtained from the initial training, and to obtain the corresponding face features; the training samples include face images of the first category and face images of the second category.
[0170] The classification and recognition module 704 is used to classify and recognize facial features using the initial domain classifier obtained from the initial training, and generate the corresponding classification results.
[0171] The face recognizer generation module 706 is used to perform adversarial training on the initial recognition network model and the initial domain classifier based on the face features and the classification results, and to use the initial recognition network model at the end of training as the trained face recognizer.
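The division of labor among modules 702, 704 and 706 can be pictured with a minimal structural sketch. The class name, method names and the use of plain callables are hypothetical and chosen only for illustration; the application does not prescribe any particular software interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Features = List[float]


@dataclass
class FaceRecognizerGenerationApparatus:
    """Hypothetical wiring of modules 702, 704 and 706 (all names are illustrative)."""
    extract_features: Callable[[Sequence[float]], Features]   # initial recognition network model
    classify_domain: Callable[[Features], int]                # initial domain classifier
    adversarial_step: Callable[[List[Features], List[int]], None]  # one adversarial update

    def recognize_702(self, samples: List[Sequence[float]]) -> List[Features]:
        # Module 702: obtain face features for every training sample.
        return [self.extract_features(s) for s in samples]

    def classify_704(self, features: List[Features]) -> List[int]:
        # Module 704: obtain a category (domain) label for every feature.
        return [self.classify_domain(f) for f in features]

    def generate_706(self, samples: List[Sequence[float]], rounds: int):
        # Module 706: repeat adversarial updates, then hand back the
        # recognition model as the trained face recognizer.
        for _ in range(rounds):
            features = self.recognize_702(samples)
            labels = self.classify_704(features)
            self.adversarial_step(features, labels)
        return self.extract_features
```

In this sketch the adversarial update itself is injected from outside, so the same wiring works with any training backend.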
[0172] In the aforementioned face recognizer generation apparatus, an initial recognition network model obtained through initial training is used to perform face feature recognition on training samples in the training sample set to obtain corresponding face features. An initial domain classifier obtained through initial training is then used to classify and recognize the face features, generating corresponding classification results. Based on the face features and the classification results, adversarial training is performed on the initial recognition network model and the initial domain classifier, and the initial recognition network model at the end of training is used as the trained face recognizer. Because the recognition network model and the domain classifier are trained adversarially, the trained face recognizer learns the common distribution among different categories of face images during adversarial learning. Based on face images that share this common distribution, data augmentation can be performed for face image categories whose training sample data is scarce. The augmented training sample data is then used to train a face recognizer for the application scenario, so that face images in different practical application scenarios can be recognized accurately and the person to whom each face image belongs can be determined, which improves the accuracy of face recognition.
[0173] In one embodiment, the face recognizer generation module is further configured to:
[0174] Superimpose the classification results in reverse onto the initial recognition network model to update the initial recognition network model; adjust the first parameter of the initial domain classifier according to the face features output by the updated initial recognition network model; and, when it is determined that the first iteration number of the adversarial training, or the joint loss function value of the initial recognition network model and the initial domain classifier, satisfies the first training termination condition, use the initial recognition network model at the end of training as the trained face recognizer.
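One round of the adversarial update described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the implementation of this application: the recognition model is reduced to a linear feature `f = w . x`, the domain classifier to a logistic unit with parameters `v` and `b`, the recognition (identity) loss is omitted, and the names `domain_loss`, `adversarial_step`, `lr` and `lam` are hypothetical. The reverse superposition is modeled as gradient ascent on the domain loss for the feature extractor.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_loss(w, v, b, samples):
    """Mean binary cross-entropy of the domain classifier on features f = w . x."""
    total = 0.0
    for x, d in samples:
        f = sum(wi * xi for wi, xi in zip(w, x))
        p = sigmoid(v * f + b)
        total -= d * math.log(p) + (1 - d) * math.log(1.0 - p)
    return total / len(samples)


def adversarial_step(w, v, b, samples, lr=0.1, lam=1.0):
    """Reversed-gradient update of the extractor, then a descent update of the classifier."""
    n = len(samples)
    # Gradient of the domain loss with respect to the extractor weights w.
    grad_w = [0.0] * len(w)
    for x, d in samples:
        f = sum(wi * xi for wi, xi in zip(w, x))
        err = sigmoid(v * f + b) - d  # derivative of the loss w.r.t. the logit
        for i, xi in enumerate(x):
            grad_w[i] += err * v * xi / n
    # Reverse effect: the extractor ascends the domain loss (sign flipped, scaled by lam).
    w = [wi + lr * lam * gi for wi, gi in zip(w, grad_w)]
    # The classifier parameters descend the loss on features from the updated extractor.
    grad_v = grad_b = 0.0
    for x, d in samples:
        f = sum(wi * xi for wi, xi in zip(w, x))
        err = sigmoid(v * f + b) - d
        grad_v += err * f / n
        grad_b += err / n
    return w, v - lr * grad_v, b - lr * grad_b


def adversarial_train(w, v, b, samples, max_iters=100):
    """Stop on an iteration budget; a joint-loss criterion is omitted in this toy."""
    for _ in range(max_iters):
        w, v, b = adversarial_step(w, v, b, samples)
    return w, v, b
```

In a fuller setup the extractor would also descend a recognition loss, so that features stay discriminative for identity while becoming harder to separate by image category.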
[0175] In one embodiment, the face recognizer generation module is further configured to:
[0176] Determine the corresponding parameter adjustment strategy based on the reverse effect of the classification results; and, according to the determined parameter adjustment strategy, adjust the second parameter of the initial recognition network model within the corresponding parameter adjustment period to obtain the updated initial recognition network model.
[0177] In one embodiment, the face recognizer generation module is further configured to:
[0178] Obtain the current iteration number in the current training process, and determine the corresponding parameter adjustment period based on the current iteration number and the preset total iteration number; within the parameter adjustment period, adjust the second parameter of the initial recognition network model according to the determined parameter adjustment strategy to obtain the updated initial recognition network model.
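The paragraph above does not spell out how the current iteration number and the preset total iteration number map to an adjustment. One possible reading, borrowed as an assumption from common domain-adversarial training practice, is to compute a progress ratio and scale the reversed gradient by a coefficient that ramps from 0 toward 1. The function name `adaptation_coefficient` and the constant `gamma` are hypothetical.

```python
import math


def adaptation_coefficient(current_iter, total_iters, gamma=10.0):
    """Assumed schedule: ramp the reversed-gradient weight from 0 toward 1."""
    if total_iters <= 0:
        raise ValueError("total_iters must be positive")
    # Progress through training, clamped to [0, 1].
    progress = min(max(current_iter / total_iters, 0.0), 1.0)
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0
```

With such a schedule the recognition model is barely disturbed early in training, when the domain classifier's output is still noisy, and feels the full reverse effect later on.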
[0179] In one embodiment, the face recognizer generation module is further configured to:
[0180] Determine, based on the reverse effect of the classification results, the common distribution information between the first category face features corresponding to the first category label and the second category face features corresponding to the second category label; and determine the corresponding parameter adjustment strategy based on the common distribution information.
[0181] In one embodiment, the face recognizer generation apparatus further includes an initial recognition model generation module, configured to:
[0182] Use the original recognition network model to identify the face feature map corresponding to each training sample in the training sample set, and extract the face features corresponding to the face feature map; determine the corresponding face recognition objective function value based on the face features and the label information of the training samples in the training sample set that correspond to the face feature map; and update the original recognition network model using gradient descent until the face recognition objective function value, or the second iteration number of the iterative training, satisfies the second training stopping condition, thereby obtaining the initial recognition network model after initial training.
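The stopping rule in this initial training (objective value or iteration count) can be illustrated with a toy stand-in. The sketch below assumes a linear softmax identity classifier with a cross-entropy objective; the actual network and objective function are not specified in this passage, and the names `pretrain_recognizer`, `loss_target` and `max_iters` are hypothetical.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def pretrain_recognizer(samples, num_ids, lr=0.5, max_iters=500, loss_target=0.05):
    """Gradient descent on a linear softmax classifier with two stopping conditions."""
    dim = len(samples[0][0])
    n = len(samples)
    weights = [[0.0] * dim for _ in range(num_ids)]
    loss = float("inf")
    for it in range(1, max_iters + 1):
        grad = [[0.0] * dim for _ in range(num_ids)]
        loss = 0.0
        for x, y in samples:
            logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
            probs = softmax(logits)
            loss -= math.log(probs[y]) / n
            for k in range(num_ids):
                err = probs[k] - (1.0 if k == y else 0.0)
                for i, xi in enumerate(x):
                    grad[k][i] += err * xi / n
        if loss <= loss_target:
            # The objective value satisfies the stopping condition.
            return weights, loss, it
        for k in range(num_ids):
            for i in range(dim):
                weights[k][i] -= lr * grad[k][i]
    # The iteration budget is exhausted; loss is the value from the last evaluation.
    return weights, loss, max_iters
```

The function returns the iteration at which it stopped, which makes it easy to see which of the two conditions ended the run.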
[0183] In one embodiment, the face recognizer generation apparatus further includes an initial domain classifier generation module, configured to:
[0184] Obtain the facial features output by the initial recognition network model; recognize the facial features using the original classifier and output the category label corresponding to each facial feature; determine the corresponding loss function value based on the recognized category labels and the pre-labeled category labels of the training samples in the training sample set that correspond to the facial features; and update the original classifier using gradient descent until the loss function value, or the third iteration number of the iterative training, satisfies the third training stopping condition, thereby obtaining the initial domain classifier after initial training.
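The initial training of the domain classifier can be sketched in the same spirit. The example assumes that the features from the initial recognition network model are already computed and held fixed, that the two image categories are encoded as labels 0 and 1, and that the classifier is a simple logistic unit; none of these choices is stated in the text, and the function names are hypothetical.

```python
import math


def pretrain_domain_classifier(features, domain_labels, lr=0.5,
                               max_iters=1000, loss_target=0.1):
    """Fit a logistic domain classifier on fixed features by gradient descent."""
    dim = len(features[0])
    n = len(features)
    v, b = [0.0] * dim, 0.0
    loss, it = float("inf"), 0
    for it in range(1, max_iters + 1):
        grad_v, grad_b, loss = [0.0] * dim, 0.0, 0.0
        for f, d in zip(features, domain_labels):
            z = sum(vi * fi for vi, fi in zip(v, f)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            # Cross-entropy between predicted and pre-labeled category.
            loss -= (d * math.log(p) + (1 - d) * math.log(1.0 - p)) / n
            for i, fi in enumerate(f):
                grad_v[i] += (p - d) * fi / n
            grad_b += (p - d) / n
        if loss <= loss_target:
            break  # the loss value satisfies the stopping condition
        v = [vi - lr * gi for vi, gi in zip(v, grad_v)]
        b -= lr * grad_b
    return v, b, loss, it


def predict_domain(v, b, feature):
    """Return the predicted category label (0 or 1) for one feature vector."""
    z = sum(vi * fi for vi, fi in zip(v, feature)) + b
    return 1 if z > 0 else 0
```

A classifier trained this way gives the adversarial stage a meaningful starting signal, since it can already tell the two image categories apart on the initial features.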
[0185] Each module in the aforementioned face recognizer generation apparatus can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device in hardware form, or stored in the memory of a computer device in software form, so that the processor can call and execute the operations corresponding to each module.
[0186] In one embodiment, a computer device is provided, which may be a server, and whose internal structure may be as shown in Figure 8. The computer device includes a processor, memory, input/output (I/O) interfaces, and a communication interface. The processor, memory, and I/O interfaces are connected via a system bus, and the communication interface is also connected to the system bus via the I/O interfaces. The processor provides computational and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system, computer programs, and a database. The internal memory provides the environment for running the operating system and computer programs stored in the non-volatile storage media. The database stores data such as first-category face images, second-category face images, face features, classification results, the initial recognition network model, the initial domain classifier, and the trained face recognizer. The I/O interfaces are used for exchanging information between the processor and external devices. The communication interface is used for communication with external terminals via a network connection. When the computer program is executed by the processor, it implements a face recognizer generation method.
[0187] Those skilled in the art will understand that the structure shown in Figure 8 is merely a block diagram of the portion of the structure related to the solution of the present application and does not constitute a limitation on the computer device to which the present application is applied. A specific computer device may include more or fewer components than those shown in the figure, combine certain components, or have a different component arrangement.
[0188] In one embodiment, a computer device is also provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps in the above method embodiments.
[0189] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon that, when executed by a processor, implements the steps in the above method embodiments.
[0190] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps in the above method embodiments.
[0191] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
[0192] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments described above. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.
[0193] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0194] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.
Claims
1. A method for generating a face recognizer, characterized in that the method includes: using the initial recognition network model obtained from initial training to perform facial feature recognition on the training samples in the training sample set to obtain the corresponding facial features, the training samples including face images of a first category and face images of a second category; using the initial domain classifier obtained from initial training to classify and recognize the facial features and generate corresponding classification results, wherein the classification results recognized by the initial domain classifier include a first category label and a second category label, and the facial features include first category facial features corresponding to the first category label and second category facial features corresponding to the second category label; determining, based on the reverse effect of the classification results, common distribution information between the first category facial features corresponding to the first category label and the second category facial features corresponding to the second category label, the common distribution information being used to expand the training dataset of the image categories that have fewer samples among the different image categories; determining a parameter adjustment strategy for the initial recognition network model based on the common distribution information; superimposing, according to the parameter adjustment strategy, the classification results onto the initial recognition network model through an adversarial training layer to update the initial recognition network model, wherein the reverse effect is manifested in the parameter adjustment strategy using gradient ascent; adjusting the first parameter of the initial domain classifier based on the facial features output by the updated initial recognition network model; and, when it is determined that the first iteration number of the adversarial training, or the joint loss function value of the initial recognition network model and the initial domain classifier, satisfies the first training termination condition, using the initial recognition network model at the end of training as the trained face recognizer.
2. The method according to claim 1, characterized in that the step of superimposing the classification results onto the initial recognition network model to update the initial recognition network model includes: determining the corresponding parameter adjustment strategy based on the reverse effect of the classification results; and, according to the determined parameter adjustment strategy, adjusting the second parameter of the initial recognition network model within the corresponding parameter adjustment period to obtain the updated initial recognition network model.
3. The method according to claim 2, characterized in that the step of adjusting the second parameter of the initial recognition network model according to the determined parameter adjustment strategy within the corresponding parameter adjustment period to obtain the updated initial recognition network model includes: obtaining the current iteration number in the current training process, and determining the corresponding parameter adjustment period based on the current iteration number and the preset total iteration number; and, within the parameter adjustment period, adjusting the second parameter of the initial recognition network model according to the determined parameter adjustment strategy to obtain the updated initial recognition network model.
4. The method according to any one of claims 1 to 3, characterized in that the manner of obtaining the initial recognition network model through training includes: using the original recognition network model to identify the facial feature map corresponding to each training sample in the training sample set, and extracting the facial features corresponding to the facial feature map; determining the corresponding face recognition objective function value based on the facial features and the label information of the training samples in the training sample set that correspond to the facial feature map; and updating the original recognition network model using gradient descent until the face recognition objective function value, or the second iteration number of the iterative training, satisfies the second training stopping condition, thereby obtaining the initial recognition network model after initial training.
5. The method according to any one of claims 1 to 3, characterized in that the manner of obtaining the initial domain classifier through training includes: obtaining the facial features output by the initial recognition network model; recognizing the facial features using the original classifier and outputting the category label corresponding to each facial feature; determining the corresponding loss function value based on the recognized category labels and the pre-labeled category labels of the training samples in the training sample set that correspond to the facial features; and updating the original classifier using gradient descent until the loss function value, or the third iteration number of the iterative training, satisfies the third training stopping condition, thereby obtaining the initial domain classifier after initial training.
6. A face recognizer generation apparatus, characterized in that the apparatus includes: a face feature recognition module, used to perform face feature recognition on the training samples in the training sample set using the initial recognition network model obtained from initial training to obtain the corresponding face features, the training samples including face images of a first category and face images of a second category; a classification and recognition module, used to classify and recognize the face features using the initial domain classifier obtained from initial training and generate corresponding classification results, wherein the classification results recognized by the initial domain classifier include a first category label and a second category label, and the face features include first category face features corresponding to the first category label and second category face features corresponding to the second category label; and a face recognizer generation module, configured to: determine, based on the reverse effect of the classification results, common distribution information between the first category face features corresponding to the first category label and the second category face features corresponding to the second category label, the common distribution information being used to expand the training dataset of the image categories that have fewer samples among the different image categories; determine a parameter adjustment strategy for the initial recognition network model based on the common distribution information; superimpose, according to the parameter adjustment strategy, the classification results onto the initial recognition network model through an adversarial training layer to update the initial recognition network model, wherein the reverse effect is manifested in the parameter adjustment strategy using gradient ascent; adjust the first parameter of the initial domain classifier based on the face features output by the updated initial recognition network model; and, when it is determined that the first iteration number of the adversarial training, or the joint loss function value of the initial recognition network model and the initial domain classifier, satisfies the first training termination condition, use the initial recognition network model at the end of training as the trained face recognizer.
7. The apparatus according to claim 6, characterized in that the face recognizer generation module is further configured to: determine the corresponding parameter adjustment strategy based on the reverse effect of the classification results; and, according to the determined parameter adjustment strategy, adjust the second parameter of the initial recognition network model within the corresponding parameter adjustment period to obtain the updated initial recognition network model.
8. The apparatus according to claim 7, characterized in that the face recognizer generation module is further configured to: obtain the current iteration number in the current training process, and determine the corresponding parameter adjustment period based on the current iteration number and the preset total iteration number; and, within the parameter adjustment period, adjust the second parameter of the initial recognition network model according to the determined parameter adjustment strategy to obtain the updated initial recognition network model.
9. The apparatus according to any one of claims 6 to 8, characterized in that the apparatus further includes an initial recognition model generation module, configured to: use the original recognition network model to identify the facial feature map corresponding to each training sample in the training sample set, and extract the facial features corresponding to the facial feature map; determine the corresponding face recognition objective function value based on the facial features and the label information of the training samples in the training sample set that correspond to the facial feature map; and update the original recognition network model using gradient descent until the face recognition objective function value, or the second iteration number of the iterative training, satisfies the second training stopping condition, thereby obtaining the initial recognition network model after initial training.
10. The apparatus according to any one of claims 6 to 8, characterized in that the apparatus further includes an initial domain classifier generation module, configured to: obtain the facial features output by the initial recognition network model; recognize the facial features using the original classifier and output the category label corresponding to each facial feature; determine the corresponding loss function value based on the recognized category labels and the pre-labeled category labels of the training samples in the training sample set that correspond to the facial features; and update the original classifier using gradient descent until the loss function value, or the third iteration number of the iterative training, satisfies the third training stopping condition, thereby obtaining the initial domain classifier after initial training.
11. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 5.
12. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 5.