Face recognition method and device, terminal and storage medium
By combining a discriminator and a multi-teacher model to train a target student network model, the problem of low accuracy in face recognition is solved, achieving higher recognition accuracy and efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NEUSOFT REACH AUTOMOBILE TECH (SHENYANG) CO LTD
- Filing Date
- 2023-02-28
- Publication Date
- 2026-06-23
Figure CN116311436B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of artificial intelligence technology, and more specifically, to a face recognition method, device, terminal, and storage medium.
Background Technology
[0002] Facial recognition is a biometric technology that identifies individuals based on their facial features. It involves using cameras or webcams to capture images or video streams containing faces, automatically detecting and tracking faces within the images, and then performing facial recognition. This process is also commonly referred to as image recognition or facial identification. With the rapid development of technology, facial recognition technology is being used more and more widely, for example, for app logins.
[0003] Currently, the main approach to determining face categories in facial recognition is to integrate and migrate multimodal recognition models with different descriptive characteristics and classification capabilities into a new network model, and then use the new network model to determine the face category.
[0004] However, the above method cannot accurately identify the category of a face.
Summary of the Invention
[0005] The main objective of this application is to provide a face recognition method, device, terminal, and storage medium to solve the problem of low face recognition accuracy in related technologies.
[0006] To achieve the above objectives, firstly, this application provides a face recognition method, comprising:
[0007] Acquire the target face image;
[0008] The target face image is input into the target student network model, and the output is the category probability vector corresponding to the target face image. The target student network model is trained based on the target network model, and the target network model combines a discriminator and a multi-teacher model.
[0009] The category with the highest probability is selected from the category probability vector, and the category corresponding to the highest category probability is taken as the category of the target face image.
[0010] In one possible implementation, before inputting the target face image into the target student network model and outputting the category probability vector corresponding to the target face image, the following steps are also included:
[0011] The target network model is trained to obtain the target student network model.
[0012] In one possible implementation, the target network model is trained to obtain the target student network model, including:
[0013] Obtain initial face images and an initial network model, wherein the initial network model combines a discriminator and a multi-teacher model;
[0014] The discriminator and multi-teacher model in the initial network model are iteratively trained using the initial face images until the preset number of iterations is reached, at which point the target network model is obtained.
[0015] The student network model in the target network model is used as the target student network model.
[0016] In one possible implementation, the discriminator and multi-teacher model in the initial network model are iteratively trained using the initial face image. After a preset number of iterations, the target network model is obtained, including:
[0017] The initial face image is input into the multi-teacher model, and the output is the category probability vector and feature map corresponding to the initial face image;
[0018] The discriminator is trained using the category probability vector and feature map corresponding to the initial face image, and the configuration parameters of the discriminator are determined.
[0019] The multi-teacher model is trained using initial face images to determine the configuration parameters of the multi-teacher model;
[0020] Return to the previous step of inputting the initial face image into the multi-teacher model and outputting the category probability vector and feature map corresponding to the initial face image, until the configuration parameters of the multi-teacher model and the discriminator meet the preset conditions, and obtain the target network model.
[0021] In one possible implementation, the initial face image is input into a multi-teacher model, which outputs a category probability vector and feature map corresponding to the initial face image, including:
[0022] The initial face image is input into the student network in the multi-teacher model, and the first category probability vector and feature map corresponding to the initial face image are output.
[0023] The initial face image is input into n teacher networks in the multi-teacher model to obtain n second category probability vectors corresponding to the initial face image, wherein the n teacher networks correspond one-to-one with the n second category probability vectors;
[0024] The initial face image is input into the n teacher networks equipped with n auxiliary networks to obtain n feature maps output by the n auxiliary networks, wherein the n auxiliary networks, the n teacher networks, and the n feature maps correspond one-to-one, and n is an integer greater than 2.
[0025] In one possible implementation, the discriminator is trained using the category probability vector and feature map corresponding to the initial face image, and the configuration parameters of the discriminator are determined, including:
[0026] Use the n second category probability vectors and the n feature maps as positive samples, and use the first category probability vector and its feature map as negative samples;
[0027] The discriminator is trained using positive and negative samples, and its configuration parameters are updated.
[0028] In one possible implementation, the multi-teacher model is trained using initial face images to determine the configuration parameters of the multi-teacher model, including:
[0029] The student network and n auxiliary networks in the multi-teacher model are trained using the initial face images, and the configuration parameters of the student network and n auxiliary networks are updated.
[0030] Secondly, embodiments of the present invention provide a face recognition device, comprising:
[0031] The acquisition module is used to acquire the target face image;
[0032] The model prediction module is used to input the target face image into the target student network model and output the category probability vector corresponding to the target face image. The target student network model is trained based on the target network model, and the target network model combines a discriminator and a multi-teacher model.
[0033] The category determination module is used to select the category with the highest probability from the category probability vector and take the category corresponding to the highest category probability as the category of the target face image.
[0034] Thirdly, embodiments of the present invention provide a terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of any of the above-mentioned face recognition methods.
[0035] Fourthly, embodiments of the present invention provide a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the steps of any of the above-described face recognition methods.
[0036] This invention provides a face recognition method, apparatus, terminal, and storage medium, comprising: acquiring a target face image; inputting the target face image into a target student network model; and outputting a category probability vector corresponding to the target face image. The target student network model is trained based on a target network model, which combines a discriminator and a multi-teacher model. The model then selects the category with the highest probability from the category probability vector and assigns the category corresponding to the highest probability as the category of the target face image. This invention, by combining a discriminator and a multi-teacher model, significantly improves the feature representation and generalization capabilities of the target student network model, thereby increasing the accuracy and efficiency of category recognition for the target face image.
Attached Figure Description
[0037] The accompanying drawings, which form part of this application, are used to provide a further understanding of the application and to make other features, objects, and advantages of the application more apparent. The illustrative embodiments and descriptions of this application are used to explain the application and do not constitute an undue limitation of the application. In the drawings:
[0038] Figure 1 is a flowchart illustrating the implementation of a face recognition method provided in an embodiment of the present invention;
[0039] Figure 2 is a schematic diagram of the structure of a network model provided in an embodiment of the present invention;
[0040] Figure 3 is a schematic diagram of the structure of a face recognition device provided in an embodiment of the present invention;
[0041] Figure 4 is a schematic diagram of the terminal provided in an embodiment of the present invention.
Detailed Implementation
[0042] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0043] The terms "first," "second," "third," "fourth," etc. (if present) in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that embodiments of the invention described herein can be implemented in sequences other than those illustrated or described herein.
[0044] It should be understood that in the various embodiments of the present invention, the sequence number of each process does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
[0045] It should be understood that in this invention, "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion, for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those steps or units that are explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to such process, method, product, or device.
[0046] It should be understood that in this invention, "multiple" refers to two or more. "And/or" merely describes the relationship between related objects, indicating that three relationships can exist. For example, "A and/or B" can represent: A existing alone, A and B existing simultaneously, and B existing alone. The character "/" generally indicates that the preceding and following related objects are in an "or" relationship. "Contains A, B, and C" means that all three of A, B, and C are contained; "Contains A, B, or C" means that one of A, B, and C is contained; "Contains A, B, and/or C" means that any one, two, or three of A, B, and C are contained.
[0047] It should be understood that in this invention, "B corresponding to A", "A and B correspond", or "B and A correspond" means that B is associated with A, and B can be determined based on A. Determining B based on A does not mean determining B solely based on A; B can also be determined based on A and/or other information. A matching B means that the similarity between A and B is greater than or equal to a preset threshold.
[0048] Depending on the context, "if" as used here can be interpreted as "when," "in response to determination," or "in response to detection."
[0049] The technical solution of the present invention will be described in detail below with reference to specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.
[0050] To make the objectives, technical solutions, and advantages of the present invention clearer, specific embodiments will be described below in conjunction with the accompanying drawings.
[0051] Deep learning has developed rapidly in recent years and has been widely used in the field of autonomous driving. However, the computing power of the chips on a vehicle platform is limited. Therefore, to meet the platform's real-time computing requirements, large deep convolutional networks such as ResNet-50 and ResNet-100 cannot be deployed successfully on the vehicle platform. Only lightweight networks such as the MobileNet series and MobileFaceNet can be used.
[0052] However, past experience and experiments have shown that larger deep convolutional networks generally learn better feature representations and have stronger generalization capabilities compared to lightweight networks. Therefore, in recent years, many studies have used larger deep convolutional networks, after being fully trained, as teacher models to guide the training of lightweight networks (student models), thereby improving the feature representation and generalization capabilities of the lightweight models. This method is also known as distillation.
[0053] Existing distillation methods typically employ two approaches to design the distillation loss function. The first calculates the vector distance between the outputs of the teacher and student models, minimizing this distance to make the student model's output closer to the teacher model's. The second approach calculates the distance between feature maps of certain intermediate layers in the teacher and student models, minimizing this distance to make the student model's feature maps as close as possible to the teacher model's. Both approaches generally use a single teacher model for guidance. Some studies suggest using multiple teacher models; however, these studies often simply average the outputs of multiple teacher models before training the student model. This can potentially lead to the loss of unique characteristics among the teacher models, failing to achieve the best guidance effect.
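The concern about averaging can be seen in a small numeric sketch. The values below are made up for illustration, `average_outputs` and `squared_distance` are hypothetical helper names that do not come from this application, and squared Euclidean distance is only assumed as one possible form of the output-distance loss.

```python
def average_outputs(teacher_outputs):
    """Element-wise mean of several equally long probability vectors."""
    n = len(teacher_outputs)
    return [sum(column) / n for column in zip(*teacher_outputs)]


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors (assumed loss form)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Made-up outputs of two teachers that disagree on a two-category problem.
teacher_a = [0.9, 0.1]
teacher_b = [0.1, 0.9]

averaged_target = average_outputs([teacher_a, teacher_b])

# A student that matches the averaged target exactly has zero loss against it,
# yet it still sits at a nonzero distance from each individual teacher.
student_output = list(averaged_target)
loss_vs_average = squared_distance(student_output, averaged_target)
loss_vs_teacher_a = squared_distance(student_output, teacher_a)

print(averaged_target, loss_vs_average, loss_vs_teacher_a)
```

In this toy case the averaged target no longer carries either teacher's confident preference, which is the kind of information loss the paragraph above describes.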
[0054] This application not only calculates the distance between the outputs or intermediate features of the teacher and student models, but also strives to make their outputs and intermediate features as close and similar as possible. Furthermore, multiple teacher models guide the training of the student models relatively independently, thus better leveraging the advantages of a multi-teacher expert team.
[0055] In one embodiment, as shown in Figure 1, a face recognition method is provided, including the following steps:
[0056] Step S101: Obtain the target face image;
[0057] Step S102: Input the target face image into the target student network model and output the category probability vector corresponding to the target face image.
[0058] The target face image can be in any format, such as JPG or PNG.
[0059] The target student network model is trained based on the target network model, which combines a discriminator and a multi-teacher model. For example, as shown in Figure 2, the target network model includes a multi-teacher model and discriminators. The multi-teacher model includes multiple teacher networks and one student network. There are multiple discriminators, namely D1, D2, d1, and d2, which are used to distinguish the outputs of the multiple teacher networks from those of the student network.
[0060] Before inputting the target face image into the target student network model and outputting the category probability vector corresponding to the target face image, the target network model needs to be trained to obtain the target student network model.
[0061] The process of training the target network model to obtain the target student network model requires first acquiring initial face images and an initial network model. Then, the discriminator and multi-teacher model within the initial network model are iteratively trained using the initial face images until a preset number of iterations is reached, resulting in the target network model. This preset number of iterations can be set according to specific circumstances and is not specifically limited here. The student network model within the target network model is then used as the target student network model. The initial network model combines the discriminator and multi-teacher model; that is, the initial network model and the target network model have the same structure, differing only in their configuration parameters.
[0062] The method involves iteratively training the discriminator and multi-teacher model in an initial network model using an initial face image until a preset number of iterations are reached, resulting in the target network model. This process includes: inputting the initial face image into the multi-teacher model, outputting the corresponding class probability vector and feature map; training the discriminator using the class probability vector and feature map to determine its configuration parameters; training the multi-teacher model again using the initial face image to determine its configuration parameters; and then returning to the previous steps of inputting the initial face image into the multi-teacher model and outputting the corresponding class probability vector and feature map, until the configuration parameters of the multi-teacher model and discriminator meet preset conditions, thus obtaining the target network model.
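The alternating loop described above can be sketched as follows. This is only an outline under stated assumptions: `train_target_network` and every callable passed to it are hypothetical placeholders, and the stopping test is modeled as a simple boolean callback because the application does not specify the preset conditions.

```python
def train_target_network(images, forward, train_discriminator,
                         train_student_and_auxiliaries, conditions_met,
                         max_iterations):
    """Alternate discriminator and multi-teacher updates.

    All callables are hypothetical placeholders supplied by the caller.
    Returns the number of iterations actually run.
    """
    completed = 0
    for _ in range(max_iterations):
        outputs = forward(images)              # probability vectors and feature maps
        train_discriminator(outputs)           # update discriminator parameters
        train_student_and_auxiliaries(images)  # update student and auxiliary parameters
        completed += 1
        if conditions_met():                   # preset stopping condition (assumed form)
            break
    return completed


# Usage with stand-in callables that only record the order of calls.
log = []
iterations_run = train_target_network(
    images=["face_0", "face_1"],
    forward=lambda imgs: log.append("forward") or {"n_images": len(imgs)},
    train_discriminator=lambda outputs: log.append("discriminator"),
    train_student_and_auxiliaries=lambda imgs: log.append("student+aux"),
    conditions_met=lambda: len(log) >= 6,
    max_iterations=10,
)
print(iterations_run, log)
```

The loop stops at whichever comes first, the iteration cap or the stopping condition, which mirrors the two termination criteria mentioned in the text.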
[0063] Inputting the initial face image into the multi-teacher model and outputting the corresponding category probability vectors and feature maps includes: inputting the initial face image into the student network of the multi-teacher model to output a first category probability vector and a feature map; inputting the initial face image into the n teacher networks of the multi-teacher model to obtain n second category probability vectors corresponding to the initial face image, where the n teacher networks correspond one-to-one with the n second category probability vectors; and inputting the initial face image into the n teacher networks equipped with n auxiliary networks to obtain n feature maps output by the n auxiliary networks, where the n auxiliary networks, the n teacher networks, and the n feature maps correspond one-to-one, and n is an integer greater than 2.
[0064] Since the multi-teacher model consists of n teacher networks and one student network, the initial face images must be passed through the n teacher networks and the student network to obtain the output data that is then used to train the discriminator. The output data includes three parts: the first part is the first category probability vector and feature map output by the student network for the initial face image; the second part is the n second category probability vectors output by the n teacher networks for the initial face image; and the third part is the n feature maps output by the auxiliary networks attached to the n teacher networks for the initial face image.
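One way to picture the three parts of the output data is the sketch below. The container `ForwardOutputs`, the function `forward_multi_teacher`, and the stand-in networks are all hypothetical; in particular, each teacher-plus-auxiliary branch is modeled as a single callable from image to feature map, which is an assumption about how the auxiliary networks are attached.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
FeatureMap = List[List[float]]


@dataclass
class ForwardOutputs:
    student_probs: Vector                 # first category probability vector
    student_features: FeatureMap          # feature map from the student network
    teacher_probs: List[Vector]           # n second category probability vectors
    auxiliary_features: List[FeatureMap]  # n feature maps from the auxiliary networks


def forward_multi_teacher(
    image,
    student: Callable[[object], Tuple[Vector, FeatureMap]],
    teachers: Sequence[Callable[[object], Vector]],
    teachers_with_auxiliary: Sequence[Callable[[object], FeatureMap]],
) -> ForwardOutputs:
    """Collect the three parts of the output data for one image."""
    if len(teachers) != len(teachers_with_auxiliary):
        raise ValueError("expected one auxiliary network per teacher network")
    student_probs, student_features = student(image)
    return ForwardOutputs(
        student_probs=student_probs,
        student_features=student_features,
        teacher_probs=[teacher(image) for teacher in teachers],
        auxiliary_features=[branch(image) for branch in teachers_with_auxiliary],
    )


# Stand-in networks returning fixed values, for n = 3 teachers.
n = 3
outputs = forward_multi_teacher(
    image="face_0",
    student=lambda img: ([0.2, 0.8], [[0.0, 1.0]]),
    teachers=[lambda img, i=i: [0.1 * i, 1.0 - 0.1 * i] for i in range(n)],
    teachers_with_auxiliary=[lambda img, i=i: [[float(i), 1.0]] for i in range(n)],
)
print(len(outputs.teacher_probs), len(outputs.auxiliary_features))
```

Keeping the per-teacher outputs in lists, rather than merging them, preserves the one-to-one correspondence between teacher networks, auxiliary networks, and feature maps.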
[0065] This application does not simply average the results and feature maps of multiple teacher models for guidance, but rather each teacher network guides the student network training relatively independently, maximizing the effectiveness of the multi-teacher model expert team.
[0066] After obtaining the data for training the discriminator, the discriminator is trained using the first category probability vector and its feature map together with the n second category probability vectors and the n feature maps, in order to determine the discriminator's configuration parameters. The specific training process is as follows:
[0067] The n second category probability vectors and the n feature maps are used as positive samples, and the first category probability vector and its feature map are used as negative samples. The discriminator is trained on these positive and negative samples, and its configuration parameters are updated.
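The sample construction can be sketched as below. The function name `build_discriminator_samples`, the label values 1 and 0, and the choice to keep probability vectors and feature maps in two separate lists are assumptions made for illustration; the application only states which outputs are positive and which are negative.

```python
def build_discriminator_samples(teacher_probs, auxiliary_features,
                                student_probs, student_features):
    """Label teacher-side outputs 1 (positive) and student outputs 0 (negative)."""
    if len(teacher_probs) != len(auxiliary_features):
        raise ValueError("expected n probability vectors and n feature maps")
    probability_samples = [(p, 1) for p in teacher_probs] + [(student_probs, 0)]
    feature_samples = [(f, 1) for f in auxiliary_features] + [(student_features, 0)]
    return probability_samples, feature_samples


# Made-up outputs for n = 3 teachers and one student.
prob_samples, feat_samples = build_discriminator_samples(
    teacher_probs=[[0.7, 0.3], [0.6, 0.4], [0.8, 0.2]],
    auxiliary_features=[[[1.0]], [[2.0]], [[3.0]]],
    student_probs=[0.5, 0.5],
    student_features=[[0.0]],
)
print([label for _, label in prob_samples])
print([label for _, label in feat_samples])
```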
[0068] Once the discriminator's configuration parameters are determined, the multi-teacher model needs to be trained using the initial face images. Specifically, the student network and n auxiliary networks in the multi-teacher model are trained using the initial face images, and the configuration parameters of the student network and n auxiliary networks are updated.
[0069] After updating the configuration parameters of the discriminator and the multi-teacher model, if the configuration parameters of the student network in the multi-teacher model still do not meet the requirements, it is necessary to return to the step of inputting the initial face image into the multi-teacher model and outputting the category probability vector and feature map corresponding to the initial face image, until the configuration parameters of the multi-teacher model and the discriminator meet the preset conditions and the target network model is obtained.
[0070] This application uses a discriminator to replace the simple distance calculation in the original algorithm. The discriminator determines whether the output class probability vector or feature map is generated by the teacher network or the student network in the multi-teacher model. By drawing on the training method of generative adversarial learning networks, the discriminator and the student network are trained alternately, so that the output results and feature maps of the student network are as close as possible to the output results and intermediate layer feature maps of the teacher network, thereby greatly improving the feature representation ability and generalization ability of the student network.
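The application does not state the loss functions used in this adversarial scheme. As a sketch only, the code below assumes the binary cross-entropy objective commonly used in generative adversarial training, with a discriminator score near 1 meaning "from a teacher" and near 0 meaning "from the student". The names `bce`, `discriminator_loss`, and `student_adversarial_loss` are hypothetical.

```python
import math


def bce(prediction, label, eps=1e-12):
    """Binary cross-entropy for one score in (0, 1) and a 0/1 label."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def discriminator_loss(teacher_scores, student_score):
    """The discriminator is rewarded for scoring teachers as 1 and the student as 0."""
    losses = [bce(score, 1) for score in teacher_scores] + [bce(student_score, 0)]
    return sum(losses) / len(losses)


def student_adversarial_loss(student_score):
    """The student is rewarded when the discriminator scores its output as 1."""
    return bce(student_score, 1)


# Made-up discriminator scores: a discriminator that separates the two sources
# well has a low loss, while the student it exposes has a high loss.
sharp = discriminator_loss(teacher_scores=[0.9, 0.9], student_score=0.1)
unsure = discriminator_loss(teacher_scores=[0.5, 0.5], student_score=0.5)
print(sharp, unsure)
print(student_adversarial_loss(0.1), student_adversarial_loss(0.9))
```

Under this assumed objective, alternating the two updates pushes the student's outputs toward ones the discriminator cannot tell apart from teacher outputs, which is the effect the paragraph above attributes to the training method.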
[0071] Step S103: Select the category with the highest probability from the category probability vector, and take the category corresponding to the highest category probability as the category of the target face image.
[0072] After the target student network model processes the target face image, it outputs a category probability vector with one entry per category. For example, the entry for category 1 is 2, the entry for category 2 is 3, the entry for category 3 is 4.5, and so on.
[0073] If the category probability vector contains only these three entries (2 for category 1, 3 for category 2, and 4.5 for category 3), then the largest entry, 4.5, is selected, and its category, category 3, is taken as the category of the target face image.
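Step S103 amounts to an argmax over the vector. A minimal sketch using the values from the example above follows; the function name `select_category` and the dictionary representation are illustrative choices, not part of the application.

```python
def select_category(category_scores):
    """Return the category whose entry in the vector is largest."""
    if not category_scores:
        raise ValueError("category probability vector is empty")
    return max(category_scores, key=category_scores.get)


# Values taken from the example in the text.
scores = {"category 1": 2, "category 2": 3, "category 3": 4.5}
print(select_category(scores))  # category 3
```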
[0074] This invention provides a face recognition method, comprising: acquiring a target face image; inputting the target face image into a target student network model; and outputting a category probability vector corresponding to the target face image. The target student network model is trained based on a target network model, which combines a discriminator and a multi-teacher model. The method then selects the category with the highest probability from the category probability vector and assigns the category corresponding to the highest probability as the category of the target face image. This invention, by combining a discriminator and a multi-teacher model, significantly improves the feature representation and generalization capabilities of the target student network model, thereby increasing the accuracy and efficiency of category recognition for the target face image.
[0075] It should be understood that the sequence number of each step in the above embodiments does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
[0076] The following are device embodiments of the present invention. For details not described in detail, please refer to the corresponding method embodiments described above.
[0077] Figure 3 illustrates the structure of a face recognition device according to an embodiment of the present invention. For ease of explanation, only the parts relevant to the embodiment are shown. The face recognition device includes an acquisition module 301, a model prediction module 302, and a category determination module 303, as detailed below:
[0078] The acquisition module 301 is used to acquire the target face image;
[0079] The model prediction module 302 is used to input the target face image into the target student network model and output the category probability vector corresponding to the target face image. The target student network model is trained based on the target network model, and the target network model combines a discriminator and a multi-teacher model.
[0080] The category determination module 303 is used to select the category with the highest category probability from the category probability vector and take the category corresponding to the highest category probability as the category of the target face image.
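The three modules can be pictured as a simple pipeline. The class and method names below are hypothetical, the image source and the student model are stand-in callables, and the category is represented by its index in the vector, all of which are assumptions made for illustration.

```python
class AcquisitionModule:
    """Module 301: acquires the target face image."""

    def __init__(self, source):
        self._source = source  # hypothetical callable returning an image

    def acquire(self):
        return self._source()


class ModelPredictionModule:
    """Module 302: runs the target student network model on the image."""

    def __init__(self, student_model):
        self._student_model = student_model  # hypothetical callable

    def predict(self, image):
        return self._student_model(image)


class CategoryDeterminationModule:
    """Module 303: picks the index of the largest entry in the vector."""

    def determine(self, category_probs):
        if not category_probs:
            raise ValueError("category probability vector is empty")
        return max(range(len(category_probs)), key=category_probs.__getitem__)


class FaceRecognitionDevice:
    """Chains modules 301, 302, and 303 in order."""

    def __init__(self, acquisition, prediction, determination):
        self.acquisition = acquisition
        self.prediction = prediction
        self.determination = determination

    def recognize(self):
        image = self.acquisition.acquire()
        category_probs = self.prediction.predict(image)
        return self.determination.determine(category_probs)


# Stand-in image source and student model with made-up outputs.
device = FaceRecognitionDevice(
    AcquisitionModule(lambda: "face_0"),
    ModelPredictionModule(lambda image: [0.1, 0.7, 0.2]),
    CategoryDeterminationModule(),
)
print(device.recognize())
```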
[0081] In one possible implementation, the model prediction module 302 is preceded by a model training module, which is used to train the target network model to obtain the target student network model.
[0082] In one possible implementation, the model training module is also used to acquire initial face images and an initial network model, wherein the initial network model combines a discriminator and a multi-teacher model.
[0083] The discriminator and multi-teacher model in the initial network model are iteratively trained using the initial face images until the preset number of iterations is reached, at which point the target network model is obtained.
[0084] The student network model in the target network model is used as the target student network model.
[0085] In one possible implementation, the model training module is also used to input the initial face image into the multi-teacher model and output the category probability vector and feature map corresponding to the initial face image;
[0086] The discriminator is trained using the category probability vector and feature map corresponding to the initial face image, and the configuration parameters of the discriminator are determined.
[0087] The multi-teacher model is trained using initial face images to determine the configuration parameters of the multi-teacher model;
[0088] Return to the previous step of inputting the initial face image into the multi-teacher model and outputting the category probability vector and feature map corresponding to the initial face image, until the configuration parameters of the multi-teacher model and the discriminator meet the preset conditions, and obtain the target network model.
[0089] In one possible implementation, the model training module is also used to input the initial face image into the student network in the multi-teacher model and output the first category probability vector and feature map corresponding to the initial face image;
[0090] The initial face image is input into n teacher networks in the multi-teacher model to obtain n second category probability vectors corresponding to the initial face image, wherein the n teacher networks correspond one-to-one with the n second category probability vectors;
[0091] The initial face image is input into the n teacher networks equipped with n auxiliary networks to obtain n feature maps output by the n auxiliary networks, wherein the n auxiliary networks, the n teacher networks, and the n feature maps correspond one-to-one, and n is an integer greater than 2.
[0092] In one possible implementation, the model training module is also used to treat the n second category probability vectors and the n feature maps as positive samples, and the first category probability vector and its feature map as negative samples.
[0093] The discriminator is trained using positive and negative samples, and its configuration parameters are updated.
[0094] In one possible implementation, the model training module is also used to train the student network and n auxiliary networks in the multi-teacher model using the initial face images, and to update the configuration parameters of the student network and the n auxiliary networks.
[0095] This invention provides a face recognition device, comprising: acquiring a target face image; inputting the target face image into a target student network model; and outputting a category probability vector corresponding to the target face image. The target student network model is trained based on a target network model, which combines a discriminator and a multi-teacher model. The device then selects the category with the highest probability from the category probability vector and assigns the category corresponding to the highest probability as the category of the target face image. This invention, by combining a discriminator and a multi-teacher model, significantly improves the feature representation and generalization capabilities of the target student network model, thereby increasing the accuracy and efficiency of category recognition for the target face image.
[0096] Figure 4 is a schematic diagram of a terminal provided in an embodiment of the present invention. As shown in Figure 4, the terminal 4 in this embodiment includes: a processor 41, a memory 42, and a computer program 43 stored in the memory 42 and executable on the processor 41. When the processor 41 executes the computer program 43, it implements the steps described in the various face recognition method embodiments above, for example, steps S101 to S103 shown in Figure 1. Alternatively, when the processor 41 executes the computer program 43, it implements the functions of each module/unit in the above-described face recognition device embodiments, for example, the functions of modules/units 301 to 303 shown in Figure 3.
[0097] The present invention also provides a readable storage medium storing a computer program, which, when executed by a processor, is used to implement the face recognition method provided in the various embodiments described above.
[0098] The readable storage medium can be a computer storage medium or a communication medium. A communication medium includes any medium that facilitates the transfer of computer programs from one location to another. A computer storage medium can be any available medium accessible to a general-purpose or special-purpose computer. For example, a readable storage medium is coupled to a processor, enabling the processor to read information from and write information to the readable storage medium. Of course, the readable storage medium can also be a component of the processor. The processor and the readable storage medium can reside in an Application-Specific Integrated Circuit (ASIC). Alternatively, the ASIC can be located in a user device. Of course, the processor and the readable storage medium can also exist as discrete components in a communication device. The readable storage medium can be a read-only memory (ROM), random access memory (RAM), CD-ROM, magnetic tape, floppy disk, and optical data storage device, etc.
[0099] The present invention also provides a program product including execution instructions stored in a readable storage medium. At least one processor of the device can read the execution instructions from the readable storage medium, and the execution of the execution instructions by the at least one processor causes the device to implement the face recognition methods provided in the various embodiments described above.
[0100] In the device embodiments described above, it should be understood that the processor can be a Central Processing Unit (CPU) or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), etc. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in this invention can be embodied as being executed directly by a hardware processor, or as being executed by a combination of hardware and software modules within the processor.
[0101] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be included within the protection scope of the present invention.
Claims
1. A face recognition method, characterized in that it comprises:
acquiring a target face image;
inputting the target face image into a target student network model and outputting a category probability vector corresponding to the target face image, wherein the target student network model is obtained by training based on a target network model, and the target network model combines a discriminator and a multi-teacher model;
selecting the highest category probability from the category probability vector, and taking the category corresponding to the highest category probability as the category of the target face image;
wherein, before inputting the target face image into the target student network model and outputting the category probability vector corresponding to the target face image, the method further comprises: training the target network model to obtain the target student network model;
wherein training the target network model to obtain the target student network model comprises: obtaining an initial face image and an initial network model, wherein the initial network model combines a discriminator and a multi-teacher model; iteratively training the discriminator and the multi-teacher model in the initial network model using the initial face image until a preset number of iterations is reached, to obtain the target network model; and taking the student network model in the target network model as the target student network model;
wherein iteratively training the discriminator and the multi-teacher model in the initial network model using the initial face image until the preset number of iterations is reached, to obtain the target network model, comprises: inputting the initial face image into the multi-teacher model, and outputting the category probability vector and feature map corresponding to the initial face image; training the discriminator using the category probability vector and feature map corresponding to the initial face image to determine the configuration parameters of the discriminator; training the multi-teacher model using the initial face image to determine the configuration parameters of the multi-teacher model; and returning to the step of inputting the initial face image into the multi-teacher model and outputting the category probability vector and feature map corresponding to the initial face image, until the configuration parameters of the multi-teacher model and the discriminator meet preset conditions, thereby obtaining the target network model.
2. The face recognition method as described in claim 1, characterized in that the step of inputting the initial face image into the multi-teacher model and outputting the category probability vector and feature map corresponding to the initial face image comprises:
inputting the initial face image into the student network in the multi-teacher model, and outputting a first category probability vector and a feature map corresponding to the initial face image;
inputting the initial face image into n teacher networks in the multi-teacher model to obtain n second category probability vectors corresponding to the initial face image, wherein the n teacher networks correspond one-to-one with the n second category probability vectors;
inputting the initial face image into the n teacher networks with n auxiliary networks to obtain n feature maps output by the n auxiliary networks, wherein the n auxiliary networks, the n teacher networks, and the n feature maps are in one-to-one correspondence, and n is an integer greater than 2.
3. The face recognition method as described in claim 2, characterized in that training the discriminator using the category probability vector and feature map corresponding to the initial face image to determine the configuration parameters of the discriminator comprises:
taking the n second category probability vectors and the n feature maps as positive samples, and taking the first category probability vector and the feature map output by the student network as negative samples;
training the discriminator using the positive samples and the negative samples, and updating the configuration parameters of the discriminator.
4. The face recognition method as described in claim 3, characterized in that training the multi-teacher model using the initial face image to determine the configuration parameters of the multi-teacher model comprises:
training the student network and the n auxiliary networks in the multi-teacher model using the initial face image, and updating the configuration parameters of the student network and the n auxiliary networks.
5. A face recognition device, characterized in that it comprises:
an acquisition module, used to acquire a target face image;
a model prediction module, used to input the target face image into a target student network model and output a category probability vector corresponding to the target face image, wherein the target student network model is obtained by training based on a target network model, and the target network model combines a discriminator and a multi-teacher model;
a category determination module, used to select the highest category probability from the category probability vector and take the category corresponding to the highest category probability as the category of the target face image;
a model training module, arranged before the model prediction module and used to train the target network model to obtain the target student network model;
wherein the model training module is further configured to: acquire an initial face image and an initial network model, wherein the initial network model combines a discriminator and a multi-teacher model; iteratively train the discriminator and the multi-teacher model in the initial network model using the initial face image until a preset number of iterations is reached, to obtain the target network model; and take the student network model in the target network model as the target student network model;
wherein the model training module is further configured to: input the initial face image into the multi-teacher model, and output the category probability vector and feature map corresponding to the initial face image; train the discriminator using the category probability vector and feature map corresponding to the initial face image to determine the configuration parameters of the discriminator; train the multi-teacher model using the initial face image to determine the configuration parameters of the multi-teacher model; and return to the step of inputting the initial face image into the multi-teacher model and outputting the category probability vector and feature map corresponding to the initial face image, until the configuration parameters of the multi-teacher model and the discriminator meet preset conditions, thereby obtaining the target network model.
6. A terminal, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that when the processor executes the computer program, it implements the steps of the face recognition method as described in any one of claims 1 to 4.
7. A computer-readable storage medium storing a computer program, characterized in that when the computer program is executed by a processor, it implements the steps of the face recognition method as described in any one of claims 1 to 4.
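For illustration only, and not as part of the claims, the alternating training procedure recited in claims 1 to 4 can be sketched as follows. This is a minimal structural sketch under stated assumptions: the function names `build_discriminator_batch` and `train`, and the callbacks `forward`, `update_discriminator`, and `update_student_and_aux`, are hypothetical; feature maps are represented as flat lists of floats; and the actual network architectures, loss functions, and the "preset conditions" on the configuration parameters are not specified in the text, so they are left to the callbacks. The loop here stops on the preset number of iterations only.

```python
from typing import Any, Callable, List, Sequence, Tuple

Vector = List[float]
Sample = Tuple[Vector, Vector]      # (category probability vector, feature map)
Labeled = Tuple[Sample, int]        # label 1 = positive sample, 0 = negative sample


def build_discriminator_batch(
    teacher_probs: Sequence[Vector],     # n second category probability vectors
    aux_feature_maps: Sequence[Vector],  # n feature maps from the auxiliary networks
    student_probs: Vector,               # first category probability vector
    student_feature_map: Vector,         # feature map from the student network
) -> List[Labeled]:
    """Teacher-side outputs become positive samples, student outputs negative."""
    n = len(teacher_probs)
    if n != len(aux_feature_maps):
        raise ValueError("teacher outputs and auxiliary feature maps must pair up")
    if n <= 2:
        raise ValueError("claim 2 requires n to be an integer greater than 2")
    batch: List[Labeled] = [
        ((p, f), 1) for p, f in zip(teacher_probs, aux_feature_maps)
    ]
    batch.append(((student_probs, student_feature_map), 0))
    return batch


def train(
    images: Sequence[Any],
    forward: Callable[[Any], Tuple[Sequence[Vector], Sequence[Vector], Vector, Vector]],
    update_discriminator: Callable[[List[Labeled]], None],
    update_student_and_aux: Callable[[Any], None],
    max_iterations: int,
) -> int:
    """Alternate discriminator and student/auxiliary updates; return update count."""
    updates = 0
    for _ in range(max_iterations):          # preset number of iterations
        for image in images:
            t_probs, aux_maps, s_probs, s_map = forward(image)
            batch = build_discriminator_batch(t_probs, aux_maps, s_probs, s_map)
            update_discriminator(batch)      # updates discriminator parameters
            update_student_and_aux(image)    # updates student and auxiliary networks
            updates += 1
    return updates
```

In this sketch the teacher networks themselves are never updated, which follows claim 4, where only the student network and the n auxiliary networks have their configuration parameters updated.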