Feature extraction model training method and related apparatus

By acquiring a group of face images from the same source, calculating similarity and loss, and performing feature mapping, the problem of insufficient feature representation and sample discrimination ability in unsupervised learning is solved, and the clustering accuracy of the feature extraction model is improved.

CN115713790B · Active · Publication Date: 2026-06-26 · TENCENT TECHNOLOGY (SHENZHEN) CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: TENCENT TECHNOLOGY (SHENZHEN) CO LTD
Filing Date: 2021-08-20
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

Existing unsupervised learning methods suffer from weak feature representation and sample discrimination capabilities in neural network models, resulting in low clustering accuracy.

Method used

A homologous face image group (a group of face images derived from the same original face image) is acquired, and a feature extraction model extracts features from it. The homologous similarity and the non-homologous similarity are calculated and used to compute the instance loss. Feature mapping is then performed to obtain classification probability distributions, from which the cluster loss is computed, and the model is trained based on both losses.

Benefits of technology

It improves the clustering accuracy of the feature extraction model and enhances the feature representation ability and the ability to distinguish between positive and negative samples.

✦ Generated by Eureka AI based on patent content.


Abstract

The embodiment of the application provides a feature extraction model training method and related devices, which can obtain a homologous face image group corresponding to an original face image; a feature extraction model is used to perform feature extraction on the homologous face image group to obtain feature information; an instance loss of the original face image is calculated according to the face image feature information; a cluster loss of the original face image is calculated according to a first classification probability distribution and a second classification probability distribution obtained by performing feature mapping processing on the face image feature information; and the feature extraction model is trained according to the instance loss and the cluster loss. Since the feature extraction model can mine the internal correlation and data rules of sample data through the instance loss and the cluster loss, the feature extraction model has stronger feature expression capability and positive and negative sample distinguishing capability, thereby improving the clustering accuracy of the feature extraction model.

Description

Technical Field

[0001] This application relates to the field of computer technology, specifically to a feature extraction model training method and related apparatus, the related apparatus including a feature extraction model training device, a computer device, and a computer-readable storage medium.

Background Technology

[0002] With the development of computer vision technology, labeled supervised learning methods have made rapid progress in recent decades. However, labeled supervised learning methods rely excessively on manually labeled data, which is time-consuming and labor-intensive. Moreover, as the amount of labeled data increases, the performance improvement of supervised learning exhibits a diminishing marginal effect.

[0003] Unsupervised learning methods can effectively eliminate the dependence on labeled data. Current unsupervised learning generally adopts unsupervised clustering methods. Most unsupervised clustering methods are traditional machine learning methods that use custom distance metrics, or simply introduce deep learning methods. However, neural network models trained with existing unsupervised clustering methods have weak feature representation and sample discrimination capabilities, resulting in low clustering accuracy.

Summary of the Invention

[0004] This application provides a feature extraction model training method and related apparatus. The related apparatus includes a feature extraction model training device, a computer device, and a computer-readable storage medium, which can improve the accuracy of feature extraction model clustering.

[0005] A method for training a feature extraction model, comprising:

[0006] Obtain a homologous face image group corresponding to the original face image. The homologous face image group includes at least two homologous face images of the original face image.

[0007] Use a feature extraction model to extract features from the face images in the homologous face image group to obtain a feature information set, which includes the face image feature information corresponding to the original face image.

[0008] Calculate, based on the face image feature information, the homologous similarity between the homologous face images of the original face image and the non-homologous similarity between the homologous face images and non-homologous face images.

[0009] Calculate the instance loss of the original face image based on the homologous similarity and the non-homologous similarity. The instance loss represents the difference loss between the homologous similarity and the non-homologous similarity of the original face image.

[0010] Perform feature mapping based on the face image feature information to obtain a first classification probability distribution and a second classification probability distribution of the homologous face images over the classification categories.

[0011] The cluster loss of the original face image is calculated based on the first classification probability distribution and the second classification probability distribution. The cluster loss characterizes the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image.

[0012] The feature extraction model is trained based on instance loss and cluster loss.

[0013] Accordingly, embodiments of this application provide a feature extraction model training apparatus, comprising:

[0014] The acquisition unit can be used to acquire a group of homologous face images corresponding to the original face image, wherein the group of homologous face images includes at least two homologous face images of the original face image.

[0015] The extraction unit can be used to extract features from the face images in the homologous face image group using a feature extraction model to obtain a feature information set, which includes the face image feature information corresponding to the original face image.

[0016] The first calculation unit can be used to calculate, based on the face image feature information, the homologous similarity between homologous face images of the original face image and the non-homologous similarity between homologous face images and non-homologous face images.

[0017] The second calculation unit can be used to calculate the instance loss of the original face image based on the homologous similarity and the non-homologous similarity. The instance loss represents the difference loss between the homologous similarity and the non-homologous similarity of the original face image.

[0018] The mapping unit can be used to perform feature mapping based on the face image feature information to obtain a first classification probability distribution and a second classification probability distribution of the homologous face images over the classification categories.

[0019] The third calculation unit can be used to calculate the cluster loss of the original face image based on the first classification probability distribution and the second classification probability distribution. The cluster loss characterizes the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image.

[0020] The training unit can be used to train the feature extraction model based on instance loss and cluster loss.

[0021] In some embodiments, the feature extraction model training device further includes a clustering unit, which can be used to cluster homologous face images based on face image feature information to obtain cluster labels and the classification probabilities of homologous face images at the corresponding cluster labels; and calculate the classification loss of different original face images at the target cluster labels based on the classification probabilities.

[0022] Specifically, the training unit can be used to train the feature extraction model based on instance loss, cluster loss, and classification loss.

[0023] In some embodiments, the clustering unit can be used to cluster homologous face images based on face image feature information to obtain cluster labels; and to match the cluster labels with the classification categories to obtain the classification probability of homologous face images in the cluster labels.

[0024] In some embodiments, the clustering unit can be used to cluster homologous face images based on face image feature information to obtain cluster labels; and to match the cluster labels with the classification categories to obtain the classification probability of homologous face images in the cluster labels.

[0025] In some embodiments, the mapping unit can be specifically used to classify homologous face images according to face image feature information to obtain an initial classification probability distribution of homologous face images in each classification category; and to split the initial classification probability distribution to obtain a first classification probability distribution of the first homologous face image in each classification category and a second classification probability distribution of the second homologous face image in each classification category.
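
The classify-then-split step of [0025] can be illustrated with a minimal Python sketch. It assumes, hypothetically, that a classification head has already produced one row of logits per homologous face image, that a softmax turns each row into a probability distribution over the classification categories, and that the rows of the first homologous face images are stacked before the rows of the second homologous face images. The names `softmax` and `classify_and_split` and the row ordering are illustrative assumptions, not taken from this application.

```python
import math

def softmax(logits):
    """Turn one row of logits into a probability distribution."""
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_and_split(logits_first, logits_second):
    """Classify all homologous images jointly, then split the result per view.

    logits_first[i] and logits_second[i] are assumed to belong to the two
    homologous face images of the i-th original face image.
    """
    stacked = logits_first + logits_second           # initial (joint) input
    initial_probs = [softmax(row) for row in stacked]
    n = len(logits_first)
    first_distribution = initial_probs[:n]           # first homologous images
    second_distribution = initial_probs[n:]          # second homologous images
    return first_distribution, second_distribution

# Two original images, two classification categories (made-up logits).
p_first, p_second = classify_and_split(
    [[2.0, 0.0], [0.0, 1.0]],
    [[1.5, 0.5], [0.2, 0.9]],
)
```

Each returned matrix has one row per original face image and one column per classification category, which is the shape the cluster loss calculation operates on.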

[0026] In some embodiments, the third calculation unit may be specifically configured to: calculate first distribution difference information of the original face image in the target classification category based on the first classification probability distribution and the second classification probability distribution of the target classification category; calculate second distribution difference information of the original face image between the target classification category and the non-target classification category based on the first classification probability distribution, the second classification probability distribution of the target classification category, the first classification probability distribution of the non-target classification category, and the second classification probability distribution of the non-target classification category; calculate a first cluster loss of the original face image based on the first and second distribution difference information, wherein the first cluster loss characterizes the difference loss between the first and second classification probability distributions of the original face image in the target classification category; and generate a cluster loss based on the first cluster loss.
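
One possible reading of [0026] is sketched below in Python. This section does not give the formula, so the sketch makes several assumptions: each classification category is represented by its column of the probability matrix (its probabilities across the batch), the "first distribution difference information" is taken to be the cosine similarity between the same category's columns in the two distributions, the "second distribution difference information" is taken to be the similarities to the columns of all other categories, and the two are combined in a contrastive (softmax-style) form with a hypothetical `temperature` parameter.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def column(matrix, k):
    """Probabilities of category k across all samples in the batch."""
    return [row[k] for row in matrix]

def first_cluster_loss(p_first, p_second, temperature=1.0):
    """Assumed contrastive form: same category across views is the positive,
    every other category (in either view) is a negative."""
    num_categories = len(p_first[0])
    cols1 = [column(p_first, k) for k in range(num_categories)]
    cols2 = [column(p_second, k) for k in range(num_categories)]
    total = 0.0
    for k in range(num_categories):
        # Target category: agreement between the two distributions.
        pos = math.exp(cosine_similarity(cols1[k], cols2[k]) / temperature)
        # Non-target categories: agreement we want to be low.
        neg = 0.0
        for j in range(num_categories):
            if j != k:
                neg += math.exp(cosine_similarity(cols1[k], cols1[j]) / temperature)
                neg += math.exp(cosine_similarity(cols1[k], cols2[j]) / temperature)
        total += -math.log(pos / (pos + neg))
    return total / num_categories

p1 = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
p2_consistent = [[0.85, 0.15], [0.9, 0.1], [0.2, 0.8], [0.1, 0.9]]
p2_swapped = [[row[1], row[0]] for row in p2_consistent]

loss_consistent = first_cluster_loss(p1, p2_consistent)
loss_swapped = first_cluster_loss(p1, p2_swapped)
```

Under these assumptions the loss is smaller when the two homologous face images of each original face image are assigned to the same category (`p2_consistent`) than when the assignments disagree (`p2_swapped`).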

[0027] In some embodiments, the third calculation unit may further be used to calculate the second cluster loss of the original face image based on the first classification probability distribution of the first homologous face image in each classification category and the second classification probability distribution of the second homologous face image in each classification category, where the second cluster loss characterizes the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image in different classification categories; and to generate a cluster loss based on the second cluster loss.

[0028] In some embodiments, the first calculation unit may be specifically used to calculate the homologous similarity between homologous face images of the original face image based on the distance, in the mapping space, between the face image feature information of the homologous face images; and to calculate the non-homologous similarity between homologous face images and non-homologous face images based on the distance, in the mapping space, between the face image feature information of the homologous face images and the face image feature information of the non-homologous face images.

[0029] Furthermore, embodiments of this application also provide a computer device, including a memory and a processor; the memory stores a computer program, and the processor is used to run the computer program in the memory to execute any of the feature extraction model training methods provided in embodiments of this application.

[0030] Furthermore, embodiments of this application also provide a computer-readable storage medium storing a computer program adapted for loading by a processor to execute any of the feature extraction model training methods provided in embodiments of this application.

[0031] Embodiments of this application can obtain a homologous face image group corresponding to the original face image, where the homologous face image group includes at least two homologous face images of the original face image. A feature extraction model is used to extract features from the face images in the homologous face image group to obtain a feature information set, which includes the face image feature information corresponding to the original face image. Based on the face image feature information, the homologous similarity between homologous face images of the original face image and the non-homologous similarity between homologous face images and non-homologous face images are calculated. Based on the homologous similarity and the non-homologous similarity, the instance loss of the original face image is calculated; the instance loss represents the difference loss between the homologous similarity and the non-homologous similarity of the original face image. Feature mapping is performed based on the face image feature information to obtain the first and second classification probability distributions of the homologous face images over the classification categories. The cluster loss of the original face image is calculated based on the first and second classification probability distributions; the cluster loss represents the difference loss between the first and second classification probability distributions. The feature extraction model is trained based on the instance loss and the cluster loss. Because the feature extraction model in this embodiment can mine the inherent correlation and data patterns of the sample data through the instance loss and the cluster loss (that is, the inherent correlation and data patterns within the homologous face image group of an original face image and across the homologous face image groups of different original face images), the feature extraction model has stronger feature expression capability and positive/negative sample discrimination capability, thereby improving its clustering accuracy.

Attached Figure Description

[0032] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0033] Figure 1 is a schematic diagram of a scenario for the feature extraction model training method provided in an embodiment of this application;

[0034] Figure 2 is a flowchart of the feature extraction model training method provided in an embodiment of this application;

[0035] Figure 3 is a schematic diagram of the process for calculating the first classification probability distribution and the second classification probability distribution in an embodiment of this application;

[0036] Figure 4 is a schematic diagram of calculating the cluster loss of the original face image based on the first classification probability distribution and the second classification probability distribution provided in an embodiment of this application;

[0037] Figure 5 is a second schematic diagram of calculating the cluster loss of the original face image based on the first classification probability distribution and the second classification probability distribution provided in an embodiment of this application;

[0038] Figure 6 is a schematic diagram of calculating the classification loss of the original face image provided in an embodiment of this application;

[0039] Figure 7 is a schematic diagram of obtaining the cluster labels and the classification probabilities of homologous face images at the corresponding cluster labels in an embodiment of this application;

[0040] Figure 8 is a second schematic flowchart of the feature extraction model training method provided in an embodiment of this application;

[0041] Figure 9 is a third schematic flowchart of the feature extraction model training method provided in an embodiment of this application;

[0042] Figure 10 is a schematic diagram of training the feature extraction model based on the total loss function provided in an embodiment of this application;

[0043] Figure 11 is a schematic diagram of testing the trained feature extraction model according to an embodiment of this application;

[0044] Figure 12 is a second schematic diagram of testing the trained feature extraction model according to an embodiment of this application;

[0045] Figure 13 is a schematic diagram of the structure of the feature extraction model training device provided in an embodiment of this application;

[0046] Figure 14 is a schematic diagram of the structure of the computer device provided in an embodiment of this application.

Detailed Implementation

[0047] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0048] This application provides a feature extraction model training method and related apparatus. The related apparatus includes a feature extraction model training device, a computer device, and a computer-readable storage medium. The feature extraction model training device can be integrated into the computer device, which can be a server or a terminal, etc.

[0049] The server can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN), and big data and artificial intelligence platforms. The terminal can be a smartphone, tablet, laptop, desktop computer, smart speaker, smartwatch, etc., but is not limited to these. The terminal and server can be directly or indirectly connected via wired or wireless communication, which is not limited herein.

[0050] This application relates to Artificial Intelligence (AI), which is the theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, AI is a comprehensive technology within computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a way similar to human intelligence. AI studies the design principles and implementation methods of various intelligent machines, enabling them to possess perception, reasoning, and decision-making capabilities.

[0051] Artificial intelligence (AI) is a comprehensive discipline encompassing a wide range of fields, including both hardware and software technologies. Fundamental AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operating / interactive systems, and mechatronics. AI software technologies primarily include computer vision, speech processing, natural language processing, and machine learning / deep learning.

[0052] For example, referring to Figure 1, take the case where the feature extraction model training device is integrated into a computer device, specifically a smartphone. The smartphone acquires a homologous face image group corresponding to the original face image; this group includes at least two homologous face images of the original face image. The feature extraction model extracts features from the face images in the homologous face image group to obtain a feature information set, which includes the face image feature information corresponding to the original face image. Based on the face image feature information, the homologous similarity between homologous face images and the non-homologous similarity between homologous and non-homologous face images are calculated. The instance loss of the original face image is then calculated from the homologous similarity and the non-homologous similarity; it characterizes the difference loss between the homologous similarity and the non-homologous similarity of the original face image. Feature mapping is performed based on the face image feature information to obtain the first classification probability distribution and the second classification probability distribution of the homologous face images over the classification categories. The cluster loss of the original face image is calculated based on the first and second classification probability distributions; it characterizes the difference loss between the first and second classification probability distributions of the original face image. The feature extraction model is then trained based on the instance loss and the cluster loss.

[0053] Homologous face images are images obtained by preprocessing the same original face image.

[0054] A homologous face image group is a group of face images consisting of at least two homologous face images obtained by preprocessing the same original face image.

[0055] Homologous similarity is the similarity between the homologous face images within the homologous face image group of an original face image.

[0056] Non-homologous face images are images whose original face image differs from the original face image of the homologous face image group.

[0057] Non-homologous similarity is the similarity between a homologous face image in a homologous face image group and a non-homologous face image.

[0058] The following sections provide detailed descriptions of each example. It should be noted that the order in which the embodiments are described is not intended to limit the preferred order of the embodiments.

[0059] This embodiment will be described from the perspective of a feature extraction model training device, which can be integrated into a computer device, such as a server or a terminal. The terminal can include tablet computers, laptops, personal computers (PCs), wearable devices, virtual reality devices, or other smart devices that can acquire data.

[0060] As shown in Figure 2, the specific process of training the feature extraction model is as follows:

[0061] S101. Obtain the homologous face image group corresponding to the original face image.

[0062] The homologous face image group includes at least two homologous face images of the original face image.

[0063] In this context, a homologous face image group is a group of face images consisting of at least two homologous face images obtained by preprocessing the same original face image, and homologous face images are the images obtained by that preprocessing. Preprocessing methods can include image augmentation techniques such as flipping, image cropping, and random grayscale conversion.
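
The preprocessing in S101 can be sketched in Python as follows. This is a minimal illustration using only the standard library; the image is assumed to be an H x W grid of RGB tuples stored as nested lists, and the crop size, the probabilities `p_flip` and `p_gray`, and all function names are hypothetical choices, not values given in this application. The grayscale conversion uses the common ITU-R BT.601 luma weights.

```python
import random

def horizontal_flip(img):
    """Mirror each row of the image."""
    return [row[::-1] for row in img]

def random_crop(img, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    top = rng.randint(0, len(img) - crop_h)
    left = rng.randint(0, len(img[0]) - crop_w)
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]

def to_grayscale(img):
    """Replace each RGB pixel by its luma, kept as three equal channels."""
    out = []
    for row in img:
        new_row = []
        for r, g, b in row:
            y = round(0.299 * r + 0.587 * g + 0.114 * b)
            new_row.append((y, y, y))
        out.append(new_row)
    return out

def make_homologous_view(img, rng, crop=(6, 6), p_flip=0.5, p_gray=0.2):
    """Produce one randomly augmented view of the original image."""
    view = random_crop(img, crop[0], crop[1], rng)
    if rng.random() < p_flip:
        view = horizontal_flip(view)
    if rng.random() < p_gray:
        view = to_grayscale(view)
    return view

def make_homologous_group(img, rng, num_views=2):
    """A homologous group: at least two views of the same original image."""
    return [make_homologous_view(img, rng) for _ in range(num_views)]

rng = random.Random(0)
original = [[(rng.randrange(256), rng.randrange(256), rng.randrange(256))
             for _ in range(8)] for _ in range(8)]
group = make_homologous_group(original, rng)
```

Because every view in `group` is derived from the same `original`, the views are homologous face images of one another in the sense defined above.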

[0064] For example, preprocessing an original face image yields two homologous face images. The two homologous face images are used for illustrative purposes only; this application is not limited to two homologous face images. The original face image can be an RGB image.

[0065] There can be multiple original face images, and therefore, there can be multiple sets of face images from the same source.

[0066] Homologous face images from the same group can be called positive samples, and homologous face images from a different group can be called negative samples relative to them. For example, preprocessing one original face image yields homologous face image group C, which includes two homologous face images, and preprocessing a different original face image yields homologous face image group D, which also includes two homologous face images. The two homologous face images in group C are positive samples of each other; relative to group C, the two homologous face images in group D are negative samples.
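
The positive/negative relationship between groups such as C and D can be made concrete with a short Python sketch. It assumes, as an illustrative convention only, that all homologous face images are stored in one flat list in which the views of each group are adjacent, so that a view's group is its index divided by the number of views per group. The function name is hypothetical.

```python
def positive_negative_indices(num_groups, views_per_group=2):
    """For every view, list the indices of its positive and negative samples.

    Views are assumed to be laid out group by group:
    [C_1, C_2, D_1, D_2, ...] for views_per_group == 2.
    """
    total = num_groups * views_per_group
    pairs = {}
    for idx in range(total):
        group = idx // views_per_group
        positives = [j for j in range(total)
                     if j != idx and j // views_per_group == group]
        negatives = [j for j in range(total)
                     if j // views_per_group != group]
        pairs[idx] = (positives, negatives)
    return pairs

# Groups C (indices 0, 1) and D (indices 2, 3).
pairs = positive_negative_indices(num_groups=2)
```

With this layout, the first view of group C has the second view of group C as its only positive sample and both views of group D as negative samples.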

[0067] S102. Use a feature extraction model to extract features from the face images in the homologous face image group to obtain a feature information set.

[0068] The feature information set includes the face image feature information corresponding to the original face image.

[0069] Specifically, the feature extraction model can extract features from the face images in the homologous face image group to obtain an initial feature information set, which includes the initial face image feature information corresponding to the original face image, and then perform feature mapping on the initial feature information set to obtain the feature information set.

[0070] For example, take the two homologous face images obtained above as a homologous face image group. The two homologous face images are input to the feature extraction model, which performs feature extraction on each of them; this may include processes such as convolution and pooling. Each homologous face image is thereby mapped to initial face image feature information of a fixed dimension, which can be 256, 512, or 1024.

[0071] The embodiments of this application are explained using multiple original face images. The initial face image feature information for multiple original face images can be obtained with the same processing as described above for the two homologous face images. For example, the original face image set includes n original face images, where n is a positive integer. After preprocessing, each original face image in the set is augmented into two homologous face images. The feature extraction model extracts features from the homologous face images corresponding to the n original face images to obtain the initial feature set, which contains, for the i-th original face image, the initial face image feature information of its two homologous face images, where i is any integer from 1 to n.

[0072] The feature extraction model can perform feature mapping on the initial feature set to obtain the feature information set. The feature information set includes the face image feature information of all homologous face images and is expressed in terms of column vectors and row vectors of a vector space. Each row of face image feature information in the feature information set can be regarded as a re-mapped feature of an original face image after preprocessing. This embodiment denotes separately the face image feature information of the two homologous face images corresponding to the i-th original face image, where i is any integer from 1 to n.

[0073] The feature extraction model can also perform a further feature mapping on the initial feature set to obtain a feature information set that is likewise expressed in terms of column vectors and row vectors of a vector space.

[0074] S103. Calculate, based on the face image feature information, the homologous similarity between homologous face images and the non-homologous similarity between homologous face images and non-homologous face images.

[0075] Homologous similarity is the similarity between the homologous face images within the homologous face image group of an original face image.

[0076] Non-homologous face images are images whose original face image differs from the original face image of the homologous face image group.

[0077] Non-homologous similarity is the similarity between a homologous face image in a homologous face image group and a non-homologous face image.

[0078] Specifically, the computer device can calculate the homologous similarity between homologous face images of the original face image based on the distance, in the mapping space, between the face image feature information of the homologous face images; and calculate the non-homologous similarity between homologous face images and non-homologous face images based on the distance, in the mapping space, between the face image feature information of the homologous face images and the face image feature information of the non-homologous face images.

[0079] Both homologous similarity and non-homologous similarity can be expressed using cosine similarity. The cosine similarity is calculated as follows:

[0080] $$\mathrm{sim}(a, b) = \frac{a^{\top} b}{\lVert a \rVert \, \lVert b \rVert}$$

[0081] Here, a and b represent the face image feature information of any two face images.
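
As a minimal sketch, the cosine similarity of two feature vectors a and b can be computed in Python as the dot product divided by the product of the Euclidean norms. The function name and the zero-vector check are illustrative additions, not part of this application.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between feature vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

The value depends only on the direction of the vectors, not their length, and ranges from -1 (opposite directions) to 1 (same direction).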

[0082] Specifically, the non-homologous similarity includes a first non-homologous similarity and a second non-homologous similarity. The non-homologous face images include a first non-homologous face image and a second non-homologous face image, and the original face images of the first non-homologous face image and the second non-homologous face image are identical. For example, this explanation uses two sets of homologous face image groups, but the embodiments of this application are not limited to two sets of homologous face image groups. The two sets of homologous face image groups are homologous face image group A and homologous face image group B. The original face image corresponding to homologous face image group A is original face image A, and the original face image corresponding to homologous face image group B is original face image B. Relative to homologous face image group A, the homologous face images in homologous face image group B are non-homologous face images of homologous face image group A.

[0083] The computer device calculates the homology similarity between homologous face images of the original face image based on the distance between facial feature information of homologous face images in the mapping space; calculates the first non-homologous similarity between homologous face images of the original face image and the first non-homologous face image based on the distance between facial feature information of homologous face images and facial feature information of the first non-homologous face image in the mapping space; and calculates the second non-homologous similarity between homologous face images of the original face image and the second non-homologous face image based on the distance between facial feature information of homologous face images and facial feature information of the second non-homologous face image in the mapping space.

[0084] Specifically, the computer device can calculate the homologous similarity between homologous face images from their facial image feature information using a Gaussian kernel function or a cosine similarity function; and calculate the non-homologous similarity between homologous face images and non-homologous face images from their facial image feature information using a Gaussian kernel function or a cosine similarity function.
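By way of illustration only, the two similarity measures named above can be sketched in Python as follows. The function names, the bandwidth parameter `sigma`, and the sample feature vectors are assumptions introduced for this sketch and do not appear in the application.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between feature vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def gaussian_kernel_similarity(a, b, sigma=1.0):
    """exp(-||a - b||^2 / (2 * sigma^2)); sigma is an assumed bandwidth."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Hypothetical 3-dimensional features: two homologous images of original
# image A, and one homologous image of a different original image B.
feat_a1 = [1.0, 0.0, 1.0]
feat_a2 = [0.9, 0.1, 1.1]
feat_b1 = [-1.0, 1.0, 0.0]

homologous_sim = cosine_similarity(feat_a1, feat_a2)
non_homologous_sim = cosine_similarity(feat_a1, feat_b1)
```

In this toy data the homologous pair scores higher than the non-homologous pair, which is the ordering the instance loss described below is meant to encourage.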

[0085] S104. Calculate the instance loss of the original face image based on homologous similarity and non-homologous similarity.

[0086] The instance loss represents the difference between the homologous similarity and the non-homologous similarity of the original face image. In this embodiment, the instance loss is constrained so that the homologous similarity is higher than the non-homologous similarity; a higher similarity value indicates that two images are more alike.

[0087] As mentioned above, the homologous similarity can be calculated based on the distance, in the mapping space, between the facial image feature information of homologous face images, and the non-homologous similarity can be calculated based on the distance, in the mapping space, between the facial image feature information of homologous face images and that of non-homologous face images. Constraining the instance loss therefore constrains the distance in the mapping space between the feature information of homologous face images to be less than the distance between the feature information of homologous face images and that of non-homologous face images.

[0088] Specifically, the computer device fuses the homologous similarity and the non-homologous similarity to obtain the difference loss between the homologous similarity of the original face image and the non-homologous similarity of the original face image.

[0089] The computer device can fuse the homologous similarity and the non-homologous similarity through a loss function, which can be an instance loss function. For any original face image, the instance loss function can be as follows:

[0090]

[0091] In this formula, the hyperparameter to be determined is set to 0.5. One similarity term denotes the homologous similarity, and the other two similarity terms both denote non-homologous similarity. The two feature vectors of one original face image denote the facial image feature information of its two homologous face images, and the two feature vectors of another original face image likewise denote the facial image feature information of its two homologous face images.

[0092] Specifically, non-homologous similarity can include a first non-homologous similarity and a second non-homologous similarity. The computer device calculates the instance loss of the original face image based on the homologous similarity, the first non-homologous similarity, and the second non-homologous similarity.

[0093] In this embodiment, of the two non-homologous similarity terms in the instance loss function above, one is regarded as the first non-homologous similarity and the other as the second non-homologous similarity.
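For illustration, an instance loss of this kind can be sketched as below. The exact formula of the application is not reproduced here; the sketch assumes a common contrastive form in which the hyperparameter 0.5 acts as a temperature `tau`, and all names (`instance_loss`, `first_views`, `second_views`) are hypothetical.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def instance_loss(first_views, second_views, i, tau=0.5):
    """Assumed contrastive loss for the first homologous image of original image i.

    first_views[j] and second_views[j] are the features of the two homologous
    images of original image j. tau stands in for the 0.5 hyperparameter.
    """
    anchor = first_views[i]
    # Homologous similarity: the two homologous images of the same original.
    positive = math.exp(cosine_similarity(anchor, second_views[i]) / tau)
    negatives = 0.0
    for j in range(len(first_views)):
        if j == i:
            continue
        # First and second non-homologous similarity: both homologous
        # images of a different original image j.
        negatives += math.exp(cosine_similarity(anchor, first_views[j]) / tau)
        negatives += math.exp(cosine_similarity(anchor, second_views[j]) / tau)
    return -math.log(positive / (positive + negatives))

# Hypothetical 2-dimensional features for two original images.
first_views = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # second views agree with first views
swapped = [[0.1, 0.9], [0.9, 0.1]]   # second views resemble the other image

loss_aligned = instance_loss(first_views, aligned, i=0)
loss_swapped = instance_loss(first_views, swapped, i=0)
```

Under this assumed form, the loss is lower when the homologous similarity exceeds the non-homologous similarities, which matches the constraint stated for the instance loss.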

[0094] S105. Perform feature mapping processing based on the facial image feature information to obtain the first classification probability distribution and the second classification probability distribution of the homologous face images over the classification categories.

[0095] The computer device performs feature mapping processing based on the facial image feature information in order to classify the homologous face images. The computer device can either classify the facial image feature information directly, or remap the facial image feature information and then classify the remapped facial image feature information.

[0096] Specifically, as shown in Figure 3, in this embodiment the first classification probability distribution and the second classification probability distribution are calculated as follows:

[0097] A1. Classify homologous face images based on their facial image features to obtain the initial classification probability distribution of homologous face images in each category.

[0098] In the feature matrix of facial image feature information, each column or each row can be regarded as a classification category. For example, each row of the feature matrix can be regarded as a classification category, and multiple rows as multiple classification categories. By classifying the facial image feature information in each column of the feature matrix, the initial classification probability distribution of the face image over the classification categories can be obtained. Alternatively, by classifying the facial image feature information in each row of the feature matrix, the initial classification probability distribution of the face image over the classification categories can be obtained. Each row of facial image feature information is mapped by a classification function, which can be the sigmoid function or the softmax function, turning the feature information set into the initial classification probability distribution, as follows:

[0099]

[0100] Here, for a given column, the first column vector consists of the first n elements of that column, and the second column vector consists of the last n elements of that column. That is, each row can be viewed as the classification probabilities of one homologous face image of an original face image over the classification categories. Accordingly, each column is the probability distribution of the homologous face images of all original face images over a single classification category, and can be regarded as a feature representation of that classification category.

[0101] A2. Split the initial classification probability distribution to obtain the first classification probability distribution of the first homologous face image in each category and the second classification probability distribution of the second homologous face image in each category.

[0102] For example, as mentioned above, the first column vector consists of the first n elements of a column and therefore contains the mapped feature information of the first homologous face images, while the second column vector consists of the last n elements of the column and therefore contains the mapped feature information of the second homologous face images. As mentioned above, each original face image i, where i is any integer from 1 to n, has two pieces of initial face image feature information: one corresponds to its first homologous face image and the other to its second homologous face image. Each column is cut into its first n elements and its last n elements. This slicing splits the initial classification probability distribution into the first classification probability distribution of the first homologous face image in each classification category and the second classification probability distribution of the second homologous face image in each classification category.
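Steps A1 and A2 can be sketched as follows, purely as an illustration. The sketch assumes the softmax option, and assumes that the 2n rows are ordered so that rows 0 to n-1 belong to the first homologous face images and rows n to 2n-1 to the second; the function names and sample values are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of feature values."""
    shift = max(row)
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def classify_and_split(features, n):
    """A1: map each row to class probabilities. A2: split into two halves.

    features has 2n rows (assumed order: n first homologous images,
    then n second homologous images) and one column per category.
    """
    probs = [softmax(row) for row in features]
    return probs[:n], probs[n:]

def column(matrix, c):
    """Probabilities of all images for classification category c."""
    return [row[c] for row in matrix]

# Hypothetical features: n = 2 original images, 3 classification categories.
features = [
    [2.0, 0.5, 0.1],   # original 1, first homologous image
    [0.2, 1.5, 0.3],   # original 2, first homologous image
    [1.8, 0.4, 0.2],   # original 1, second homologous image
    [0.1, 1.7, 0.2],   # original 2, second homologous image
]
first_dist, second_dist = classify_and_split(features, n=2)
```

Each row of `first_dist` and `second_dist` sums to 1, and `column(first_dist, c)` gives the per-category column vector that the cluster loss described below compares.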

[0103] S106. Calculate the cluster loss of the original face image based on the first classification probability distribution and the second classification probability distribution.

[0104] Here, cluster loss characterizes the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image.

[0105] Cluster loss can include the first cluster loss, the second cluster loss, or both the first and second cluster losses.

[0106] As shown in Figure 4, when the cluster loss includes the first cluster loss, the computer device calculates the cluster loss of the original face image based on the first classification probability distribution and the second classification probability distribution as follows:

[0107] B1. Calculate the first distribution difference information of the original face image in the target classification category based on the first classification probability distribution and the second classification probability distribution of the target classification category.

[0108] In this embodiment, the aforementioned classification probability distributions can be processed, for example converted into a matrix K. K is the new matrix formed by transposing the first classification probability distribution and the second classification probability distribution and recombining them, as shown below:

[0109]

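[0110] Here, the two blocks of K are the transposes of the first classification probability distribution and the second classification probability distribution, respectively.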

[0111] This embodiment uses the aforementioned cosine similarity formula to calculate the first distribution difference information. The cosine similarity formula is detailed above and is not repeated here.

[0112] B2. Based on the first classification probability distribution of the target classification category, the second classification probability distribution of the target classification category, the first classification probability distribution of the non-target classification category, and the second classification probability distribution of the non-target classification category, the second distribution difference information of the original face image between the target classification category and the non-target classification category is calculated.

[0113] Here, a non-target classification category is a classification category different from the target classification category. For example, if the target classification category is the i-th classification category, a non-target classification category is the j-th classification category, and the two similarity terms relating the i-th and j-th classification categories are regarded as the second distribution difference information.

[0114] Specifically, the second distribution difference information may include first sub-distribution difference information and second sub-distribution difference information: one of the two terms is taken as the first sub-distribution difference information and the other as the second sub-distribution difference information.

[0115] B3. Calculate the first cluster loss of the original face image based on the first distribution difference information and the second distribution difference information.

[0116] The first cluster loss represents the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image in the target classification category.

[0117] Specifically, the computer device calculates the first cluster loss of the original face image based on the first distribution difference information, the first sub-distribution difference information, and the second sub-distribution difference information.

[0118] For the i-th classification category, the specific formula for the first cluster loss is as follows:

[0119]

[0120] Here, the hyperparameter to be determined is set to 0.5 in this embodiment.

[0121] This embodiment constrains the first cluster loss so that the first classification probability distribution and the second classification probability distribution of the target classification category are similar. This can be understood as the difference between the two being less than a preset difference; that is, the difference between the first classification probability distribution and the second classification probability distribution of the original face image in the target classification category is less than a preset threshold.

[0122] B4. Generate cluster loss based on the first cluster loss.
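Steps B1 to B3 can be illustrated with the following sketch. The formula of the application is not reproduced; the sketch assumes a contrastive form over category columns, with the hyperparameter 0.5 used as a temperature `tau`. All function and variable names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def first_cluster_loss(first_dist, second_dist, i, tau=0.5):
    """Assumed contrastive loss for target classification category i.

    first_dist and second_dist are n x m lists of class probabilities.
    Transposing them gives one vector per classification category.
    """
    m = len(first_dist[0])
    cols_a = [[row[c] for row in first_dist] for c in range(m)]
    cols_b = [[row[c] for row in second_dist] for c in range(m)]
    # B1: first distribution difference information (same category, two views).
    positive = math.exp(cosine_similarity(cols_a[i], cols_b[i]) / tau)
    negatives = 0.0
    for j in range(m):
        if j == i:
            continue
        # B2: first and second sub-distribution difference information
        # (target category i against non-target category j).
        negatives += math.exp(cosine_similarity(cols_a[i], cols_a[j]) / tau)
        negatives += math.exp(cosine_similarity(cols_a[i], cols_b[j]) / tau)
    # B3: combine into the first cluster loss.
    return -math.log(positive / (positive + negatives))

first_dist = [[0.9, 0.1], [0.2, 0.8]]
consistent = [[0.8, 0.2], [0.1, 0.9]]     # second views agree on categories
inconsistent = [[0.2, 0.8], [0.9, 0.1]]   # second views disagree

loss_consistent = first_cluster_loss(first_dist, consistent, i=0)
loss_inconsistent = first_cluster_loss(first_dist, inconsistent, i=0)
```

Under this assumption the loss decreases as the two classification probability distributions of the target category become more similar, in line with the constraint described above.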

[0123] As shown in Figure 5, when the cluster loss includes the second cluster loss, the computer device calculates the cluster loss of the original face image based on the first and second classification probability distributions as follows:

[0124] C1. Calculate the second cluster loss of the original face image based on the first classification probability distribution of the first homologous face image in each classification category and the second classification probability distribution of the second homologous face image in each classification category.

[0125] The second cluster loss characterizes the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image in different classification categories.

[0126] In this embodiment, the second cluster loss is calculated using an entropy-based averaging loss formula. The purpose of the second cluster loss is to avoid degeneration, so the probabilities of all homologous face images must remain evenly spread across the classification categories. For example, if the classification categories include the i-th classification category and the j-th classification category, the sum of the probabilities of all homologous face images for the i-th classification category is approximately equal to the sum of the probabilities of all homologous face images for the j-th classification category. The formula for calculating the second cluster loss is as follows:

[0127]

[0128] C2. Generate cluster loss based on the second cluster loss.
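As an illustration of steps C1 and C2, the sketch below assumes that the second cluster loss is the negative entropy of the per-category probability mass, summed over the two classification probability distributions. The exact formula of the application is not reproduced, and the function name and sample data are hypothetical.

```python
import math

def second_cluster_loss(first_dist, second_dist):
    """Assumed negative-entropy loss over per-category probability mass.

    Lower values mean the probabilities of all homologous face images are
    spread more evenly over the classification categories.
    """
    loss = 0.0
    for dist in (first_dist, second_dist):
        m = len(dist[0])
        # Sum of the probabilities of all images for each category.
        totals = [sum(row[c] for row in dist) for c in range(m)]
        grand_total = sum(totals)
        for t in totals:
            p = t / grand_total
            if p > 0.0:
                loss += p * math.log(p)
    return loss

balanced = [[0.9, 0.1], [0.1, 0.9]]    # each category receives equal mass
collapsed = [[0.9, 0.1], [0.9, 0.1]]   # almost everything in one category

loss_balanced = second_cluster_loss(balanced, balanced)
loss_collapsed = second_cluster_loss(collapsed, collapsed)
```

Minimizing this quantity pushes the per-category sums toward equality, which is one way to discourage the degenerate solution in which all images fall into a single category.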

[0129] Furthermore, in this embodiment, after feature mapping processing is performed based on the facial image feature information to obtain the first classification probability distribution and the second classification probability distribution of the homologous face images over the classification categories, the classification loss of the original face image is also calculated, as shown in Figure 6, as follows:

[0130] D1. Cluster the same-origin face images based on the face image feature information to obtain the cluster label and the classification probability of the same-origin face image at the corresponding cluster label.

[0131] Specifically, in this embodiment, after cluster labels are obtained through clustering, the facial image feature information of the homologous face images is classified based on the cluster labels to obtain the classification probability of the homologous face images at the corresponding cluster labels.

[0132] Specifically, as shown in Figure 7, the process of obtaining the cluster labels and the classification probability of the homologous face images at the cluster labels can be as follows:

[0133] d1. Cluster the same-origin face images based on their facial image feature information to obtain cluster labels.

[0134] In this embodiment, the aforementioned feature information set can be clustered, that is, the facial image feature information in the feature information set is grouped into m clusters, where m is a positive integer that can be configured as required. Cluster labels are then assigned to the clusters. In each round, the cluster label assignment follows the Hungarian algorithm so that the cluster labels of the current round stay consistent with those of the previous round, which keeps the cluster label assignment of the feature extraction model stable.

[0135] In particular, the embodiments of this application can calculate the instance loss based on the aforementioned feature information set and, through this instance loss, learn the inherent correlations and patterns in the sample data, namely those within the homologous face image group of one original face image and those between the homologous face image groups of different original face images. This strengthens the feature representation ability of the feature extraction model and its ability to distinguish between positive and negative samples. Therefore, clustering the feature information set to obtain cluster labels yields higher clustering accuracy.
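The round-to-round label assignment described above can be illustrated as follows. For brevity the sketch replaces the Hungarian algorithm with an exhaustive search over one-to-one assignments, which finds the same optimum but is practical only for small m; the function name and the sample cluster ids are hypothetical.

```python
from itertools import permutations

def match_cluster_labels(previous, current, m):
    """Relabel `current` cluster ids to agree with `previous` where possible.

    previous and current hold per-sample cluster ids in range(m) for the
    same samples in two consecutive rounds.
    """
    # overlap[c][p]: samples in current cluster c that were in previous cluster p.
    overlap = [[0] * m for _ in range(m)]
    for p, c in zip(previous, current):
        overlap[c][p] += 1
    # Exhaustive stand-in for the Hungarian algorithm (small m only):
    # choose the one-to-one mapping with the largest total overlap.
    best = max(
        permutations(range(m)),
        key=lambda perm: sum(overlap[c][perm[c]] for c in range(m)),
    )
    return [best[c] for c in current]

previous_round = [0, 0, 1, 1, 2, 2]
current_round = [2, 2, 0, 0, 1, 0]   # same grouping, mostly renamed
stable_labels = match_cluster_labels(previous_round, current_round, m=3)
# stable_labels == [0, 0, 1, 1, 2, 1]
```

Only the last sample changes label after matching, because it actually moved to a different cluster; the other samples keep the labels they had in the previous round.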

[0136] Specifically, in this embodiment, labels can be matched to facial image feature information to obtain facial image feature information carrying labels; the facial image feature information carrying labels can be clustered to obtain clusters; and the label with the most occurrences in the cluster can be determined as the cluster label corresponding to the cluster.

[0137] In this embodiment, the same label can be assigned to facial image feature information corresponding to the same or a single original facial image. When the original facial images are different, there can be multiple labels.
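The majority rule for naming a cluster can be sketched as below. The label values, cluster ids, and function name are hypothetical, and the clustering step itself is assumed to have been carried out already.

```python
from collections import Counter

def cluster_labels_by_majority(sample_labels, cluster_ids):
    """Return, for each cluster, the label that occurs most often in it.

    sample_labels[k] is the label carried by feature k (features of the same
    original face image share a label); cluster_ids[k] is its cluster.
    """
    grouped = {}
    for label, cluster in zip(sample_labels, cluster_ids):
        grouped.setdefault(cluster, []).append(label)
    return {
        cluster: Counter(labels).most_common(1)[0][0]
        for cluster, labels in grouped.items()
    }

sample_labels = ["A", "A", "B", "B", "B", "A"]
cluster_ids = [0, 0, 0, 1, 1, 1]
labels_per_cluster = cluster_labels_by_majority(sample_labels, cluster_ids)
# labels_per_cluster == {0: "A", 1: "B"}
```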

[0138] d2. Match the cluster labels with the classification categories to obtain the classification probability of homologous face images in the cluster labels.

[0139] In this embodiment, the cluster labels obtained from clustering are matched with the aforementioned classification categories, and the matched cluster labels are used to supervise training.

[0140] D2. Calculate the classification loss of different original face images for the target cluster label based on the classification probability.

[0141] In this embodiment, the classification loss is calculated using a loss function, which can be the cross-entropy loss function, and the formula for the cross-entropy loss is as follows:

[0142]

[0143] Here, k is the cluster label matched to the facial image feature information of the homologous face images of the i-th original face image. The classification loss constraint maximizes the classification probability of the homologous face images of each original face image on their corresponding cluster label.
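A cross-entropy classification loss of this kind can be sketched as follows. Averaging over the images is an assumption of this sketch, as are the function name and the sample probabilities.

```python
import math

def classification_loss(probabilities, cluster_labels):
    """Mean cross-entropy of each image's probability at its cluster label.

    probabilities[i] is the classification probability distribution of image i;
    cluster_labels[i] is the index k of its matched cluster label.
    """
    total = 0.0
    for probs, k in zip(probabilities, cluster_labels):
        total += -math.log(probs[k])
    return total / len(probabilities)

labels = [0, 1]
confident = [[0.9, 0.1], [0.2, 0.8]]
uncertain = [[0.5, 0.5], [0.5, 0.5]]

loss_confident = classification_loss(confident, labels)
loss_uncertain = classification_loss(uncertain, labels)
```

The loss falls as the probability assigned to the matched cluster label rises, so minimizing it maximizes that probability.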

[0144] When the cluster loss includes the first cluster loss and the second cluster loss, the computer device generates the cluster loss based on the first cluster loss and the second cluster loss obtained above.

[0145] S107. Train the feature extraction model based on instance loss and cluster loss.

[0146] In this embodiment, instance loss and cluster loss can be fused to obtain the fused loss, which is the total loss function. The parameters of the feature extraction model are adjusted based on the total loss function to train the feature extraction model.

[0147] Specifically, in this embodiment of the application, the feature extraction model can also be trained based on instance loss, cluster loss, and classification loss.
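The fusion of the losses into a total loss can be illustrated with a weighted sum. The application does not state the fusion rule, so the weighted sum, the weight parameters, and their default value of 1.0 are all assumptions of this sketch.

```python
def total_loss(instance_loss, cluster_loss, classification_loss=None,
               w_instance=1.0, w_cluster=1.0, w_classification=1.0):
    """Assumed fusion: weighted sum of the available loss terms."""
    total = w_instance * instance_loss + w_cluster * cluster_loss
    if classification_loss is not None:
        # Optional third term, used when the classification loss is computed.
        total += w_classification * classification_loss
    return total

loss_two_terms = total_loss(0.5, 0.25)
loss_three_terms = total_loss(0.5, 0.25, classification_loss=0.25)
```

The parameters of the feature extraction model would then be adjusted to reduce this total, for example by gradient descent in whichever training framework is used.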

[0148] This application embodiment can obtain a group of homologous face images corresponding to the original face image. The homologous face image group includes at least two homologous face images of the original face image. A feature extraction model is used to extract features from the face images in the homologous face image group to obtain a feature information set, which includes the face image feature information corresponding to the original face image. Based on the face image feature information, the homologous similarity between homologous face images of the original face image and the non-homologous similarity between homologous face images and non-homologous face images are calculated. Based on the homologous similarity and non-homologous similarity, the instance loss of the original face image is calculated. The loss represents the difference between the homologous similarity and the non-homologous similarity of the original face images. Feature mapping is performed based on the face image feature information to obtain the first and second classification probability distributions of homologous face images in different classification categories. The cluster loss of the original face images is calculated based on the first and second classification probability distributions, representing the difference between the first and second classification probability distributions. The feature extraction model is trained based on the instance loss and cluster loss. 
Because the feature extraction model in this embodiment can mine the inherent correlation and data patterns of sample data through instance loss and cluster loss—that is, the inherent correlation and data patterns of homologous face image groups of the original face images and homologous face image groups corresponding to different original face images—the feature extraction model has stronger feature expression capabilities and positive / negative sample discrimination capabilities, thereby improving the accuracy of clustering in the feature extraction model.

[0149] The following describes in detail, as shown in Figure 8, an example in which the feature extraction model training apparatus is integrated into a computer device and the feature extraction model is applied to generalized face images. Generalized face images are, for example, cartoon character face images.

[0150] S201. The computer device acquires the homologous face image group corresponding to the original face image.

[0151] Among them, the homologous face image group includes at least two homologous face images of the original face image.

[0152] As shown in Figure 9, the computer device in this embodiment reads multiple original face images through a sample reading module. Each original face image is processed by a sample preprocessing module and a sample data amplification module to obtain two corresponding homologous face images. For example, one original face image yields two homologous face images after processing by the sample preprocessing module and the sample data amplification module. The other original face images are processed in the same way.
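The generation of two homologous face images from one original face image can be sketched as below. The application does not specify the amplification operations; the random horizontal flip and brightness shift used here, the function names, and the tiny grayscale image are assumptions for illustration only.

```python
import random

def augment(image, rng):
    """One random view: optional horizontal flip plus a brightness shift.

    image is a list of rows of grayscale pixel values in [0, 1].
    """
    flip = rng.random() < 0.5
    shift = rng.uniform(-0.1, 0.1)
    view = []
    for row in image:
        new_row = row[::-1] if flip else list(row)
        # Clamp so the shifted pixel values stay within [0, 1].
        view.append([min(1.0, max(0.0, px + shift)) for px in new_row])
    return view

def homologous_group(image, rng):
    """Two independently augmented views of the same original image."""
    return augment(image, rng), augment(image, rng)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
original = [[0.1, 0.5, 0.9], [0.2, 0.6, 1.0]]
view_1, view_2 = homologous_group(original, rng)
```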

[0153] S202. The computer device uses a feature extraction model to extract features from the face images in the homologous face image group and obtains a feature information set.

[0154] The feature information set includes facial image feature information corresponding to the original facial image.

[0155] Specifically, the feature extraction model can extract features from the face images in the homologous face image group to obtain an initial feature information set, which includes the initial face image feature information corresponding to the original face image; feature mapping is then performed on the initial feature information set to obtain the feature information set.

[0156] The computer device uses the sample feature extraction module of the feature extraction model to extract features from the face images in the homologous face image group and obtain the initial feature information set. For example, given a set of n original face images, where n is a positive integer, processing by the sample reading module, the sample preprocessing module, the sample data amplification module, and the sample feature extraction module of the computer device yields the initial feature information set. In this set, each original face image i, where i is any integer from 1 to n, has two pieces of initial face image feature information, one for each of its two homologous face images.

[0157] The instance self-supervised module of computer devices uses the initial feature information set Perform feature mapping to obtain the feature information set. G Feature information set Including facial image feature information, , Represents a column vector. Represents a row vector. express A dimensional vector space. The embodiments of this application use... and Represents the original human face image The facial feature information corresponding to the two original facial images, i From 1 to n Any number between.

[0158] The cluster self-supervised module of computer equipment for the initial feature set Perform feature mapping to obtain the feature information set. , , Represents a column vector. Represents a row vector. express A dimensional vector space.

[0159] S203. The computer equipment calculates the homology similarity between homologous face images and the non-homologous similarity between homologous face images and non-homologous face images based on the facial image feature information.

[0160] Specifically, the instance self-supervised module can calculate the homology similarity between homology face images based on the distance between facial feature information of homology face images in the mapping space; and calculate the non-homology similarity between homology face images and non-homology face images based on the distance between facial image feature information of homology face images and facial image feature information of non-homology face images in the mapping space.

[0161] Both the homologous similarity and the non-homologous similarity can be expressed as cosine similarity. The formula for calculating cosine similarity is detailed above and is not repeated here.

[0162] Specifically, the instance self-supervised module calculates the homology similarity between homology face images of the original face image based on the distance between the facial feature information of homology face images in the mapping space; calculates the first non-homology similarity between homology face images of the original face image and the first non-homology face image based on the distance between the facial image feature information of homology face images and the facial image feature information of the first non-homology face image in the mapping space; and calculates the second non-homology similarity between homology face images of the original face image and the second non-homology face image based on the distance between the facial image feature information of homology face images and the facial image feature information of the second non-homology face image in the mapping space.

[0163] S204. The computer device calculates the instance loss of the original face image based on the homologous similarity and the non-homologous similarity.

[0164] The instance loss represents the difference between the homologous similarity and the non-homologous similarity of the original face image. This embodiment constrains the instance loss by requiring the homologous similarity to be higher than the non-homologous similarity; a higher similarity value indicates that two images are more alike.

[0165] As mentioned above, the homologous similarity can be calculated based on the distance, in the mapping space, between the facial image feature information of homologous face images, and the non-homologous similarity can be calculated based on the distance, in the mapping space, between the facial image feature information of homologous face images and that of non-homologous face images. Constraining the instance loss therefore constrains the distance in the mapping space between the feature information of homologous face images to be less than the distance between the feature information of homologous face images and that of non-homologous face images.

[0166] Specifically, the computer device fuses the homologous similarity and non-homologous similarity to obtain the difference loss between the homologous similarity and non-homologous similarity of the original face image. The method by which the computer device fuses the homologous and non-homologous similarity can be implemented using an instance loss function, which is detailed above and will not be repeated here.

[0167] S205. The computer device performs feature mapping processing based on the facial image feature information to obtain the first classification probability distribution and the second classification probability distribution of the homologous face images over the classification categories.

[0168] Specifically, the cluster self-supervised module classifies homologous face images based on face image feature information to obtain the initial classification probability distribution of homologous face images in each classification category; the initial classification probability distribution is then split to obtain the first classification probability distribution of the first homologous face image in each classification category and the second classification probability distribution of the second homologous face image in each classification category.

[0169] In the feature matrix of facial image feature information, each column is regarded as a classification category, and multiple columns are regarded as multiple classification categories. By classifying the facial image feature information in each row of the feature matrix, the initial classification probability distribution of the face image over the classification categories can be obtained. For example, each row of facial image feature information in the aforementioned feature information set is mapped by a classification function, such as the softmax function, turning the feature information set into the initial classification probability distribution. This is detailed above and is not repeated here.

[0170] Each column of the initial classification probability distribution is cut into its first n elements and its last n elements. This slicing splits the initial classification probability distribution into the first classification probability distribution of the first homologous face image in each classification category and the second classification probability distribution of the second homologous face image in each classification category.

[0171] S206. The computer device calculates the cluster loss of the original face image based on the first classification probability distribution and the second classification probability distribution.

[0172] Here, cluster loss characterizes the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image.

[0173] The cluster loss includes the first cluster loss and the second cluster loss.

[0174] Specifically, the cluster self-supervised module calculates the first distribution difference information of the original face image in the target classification category based on the first classification probability distribution and the second classification probability distribution of the target classification category; calculates the second distribution difference information of the original face image between the target classification category and the non-target classification category based on the first classification probability distribution, the second classification probability distribution of the target classification category, the first classification probability distribution of the non-target classification category, and the second classification probability distribution of the non-target classification category; and calculates the first cluster loss of the original face image based on the first distribution difference information and the second distribution difference information.

[0175] The first cluster loss represents the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image in the target classification category.

[0176] In this embodiment of the application, the first classification probability distribution and the second classification probability distribution obtained above can be transposed and recombined into a new matrix organized by the K classification categories. The details are given above and are not repeated here.

[0177] The second distribution difference information may include the first sub-distribution difference information and the second sub-distribution difference information. The cluster self-supervised module calculates the first cluster loss of the original face image based on the first distribution difference information, the first sub-distribution difference information, and the second sub-distribution difference information. The calculation formula for the first cluster loss is detailed above and will not be repeated here.
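Because the formula itself is not reproduced in this passage, the following is only a sketch of one way such a loss could be built, not the formula of this application. It assumes a contrastive form in which the columns of the two probability distributions are compared by cosine similarity: the same category across the two distributions supplies the first distribution difference information, and the other categories within the first distribution and within the second distribution supply the two sub-distribution terms. The temperature `tau` and all function names are hypothetical.

```python
import math

def column(matrix, k):
    """Return column k of a row-major matrix."""
    return [row[k] for row in matrix]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def first_cluster_loss(p_first, p_second, tau=1.0):
    """Assumed contrastive loss over classification categories (columns)."""
    k_total = len(p_first[0])
    cols_a = [column(p_first, k) for k in range(k_total)]
    cols_b = [column(p_second, k) for k in range(k_total)]
    loss = 0.0
    for k in range(k_total):
        # Same (target) category across the two distributions.
        pos = math.exp(cosine(cols_a[k], cols_b[k]) / tau)
        neg = 0.0
        for j in range(k_total):
            if j != k:
                # Target vs. non-target category within the first distribution.
                neg += math.exp(cosine(cols_a[k], cols_a[j]) / tau)
                # Target vs. non-target category across the two distributions.
                neg += math.exp(cosine(cols_a[k], cols_b[j]) / tau)
        loss += -math.log(pos / (pos + neg))
    return loss / k_total

p_first = [[0.9, 0.1], [0.1, 0.9]]
consistent = first_cluster_loss(p_first, [[0.8, 0.2], [0.2, 0.8]])
inconsistent = first_cluster_loss(p_first, [[0.2, 0.8], [0.8, 0.2]])
```

Under these assumptions the loss is smaller when both homologous face images are assigned to the same category (`consistent`) than when their assignments disagree (`inconsistent`).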

[0178] Specifically, the cluster self-supervised module calculates the second cluster loss of the original face image based on the first classification probability distribution of the first homologous face image in each classification category and the second classification probability distribution of the second homologous face image in each classification category.

[0179] The second cluster loss characterizes the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image across different classification categories. The calculation formula for the second cluster loss is detailed above and will not be repeated here.

[0180] S207. The computer device clusters the homologous face images based on the facial image feature information to obtain cluster labels and the classification probability of the homologous face images at the corresponding cluster labels.

[0181] Specifically, the computer device can perform offline clustering of facial features using the instance self-supervised module. The instance self-supervised module clusters the homologous face images based on the facial image feature information to obtain cluster labels. For example, the instance self-supervised module can match labels to the facial image feature information to obtain facial image feature information carrying labels; cluster the facial image feature information carrying labels to obtain clusters; and determine the label that occurs most often in a cluster as the cluster label corresponding to that cluster.
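The last step of the example above, choosing the most frequent label in each cluster as the cluster label, can be sketched as follows. The clustering algorithm itself is not specified in this passage, so the sketch assumes that the cluster assignments are already available; the function name and the sample labels are hypothetical.

```python
from collections import Counter

def majority_cluster_labels(assignments, labels):
    """Pick, for each cluster, the label that occurs most often in it.

    assignments[i] is the cluster id of sample i (assumed to be given),
    labels[i] is the label matched to the feature information of sample i.
    """
    by_cluster = {}
    for cluster_id, label in zip(assignments, labels):
        by_cluster.setdefault(cluster_id, []).append(label)
    return {
        cluster_id: Counter(members).most_common(1)[0][0]
        for cluster_id, members in by_cluster.items()
    }

cluster_label = majority_cluster_labels(
    [0, 0, 0, 1, 1],
    ["ID_1", "ID_1", "ID_2", "ID_3", "ID_3"],
)
```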

[0182] Then, through the interactive category alignment module, the computer device uses the clustering labels obtained from the instance self-supervised module as a supervision signal to supervise the cluster self-supervised module in class assignment and training. The computer device matches the clustering labels obtained from the instance self-supervised module with the classification categories obtained from the cluster self-supervised module through the interactive category alignment module; that is, in this embodiment, the clustering labels are matched with the classification categories obtained above, and the matching result is used to supervise training.
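The matching rule is not spelled out in this passage. One plausible reading, shown here purely as an assumption, is a one-to-one assignment that maximizes the overlap between cluster labels and the most probable classification category of each sample. The sketch uses an exhaustive search over permutations, which is only practical for a small number of categories; for larger numbers an assignment solver such as the Hungarian algorithm would be the usual choice. It also assumes that cluster labels and categories are both numbered 0 to k-1.

```python
from itertools import permutations

def match_labels_to_categories(cluster_ids, predicted_categories, k):
    """Find the one-to-one mapping cluster label -> category with maximum overlap."""
    # overlap[c][j]: samples with cluster label c whose most probable category is j
    overlap = [[0] * k for _ in range(k)]
    for c, j in zip(cluster_ids, predicted_categories):
        overlap[c][j] += 1
    best = max(
        permutations(range(k)),
        key=lambda perm: sum(overlap[c][perm[c]] for c in range(k)),
    )
    return {c: best[c] for c in range(k)}

mapping = match_labels_to_categories(
    cluster_ids=[0, 0, 1, 1, 2, 2],
    predicted_categories=[2, 2, 0, 0, 1, 0],
    k=3,
)
```

The resulting mapping gives, for each cluster label, the classification category whose probability would serve as the training target.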

[0183] S208. The computer device calculates the classification loss of different original face images in the target clustering label based on the classification probability.

[0184] In this embodiment, the classification loss is calculated using a loss function, which can be the cross-entropy loss function. The formula for the cross-entropy loss is detailed above and will not be repeated here.
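As a reminder of what the cross-entropy loss computes, the following minimal sketch averages the negative log-probability assigned to each sample's target category. The target indices stand in for the categories matched to the cluster labels; the function name and the sample values are hypothetical.

```python
import math

def cross_entropy(probabilities, targets):
    """Mean negative log-probability of the target category of each sample."""
    eps = 1e-12  # guards against log(0)
    total = sum(math.log(p[t] + eps) for p, t in zip(probabilities, targets))
    return -total / len(targets)

probabilities = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
loss_matching = cross_entropy(probabilities, [0, 1])
loss_mismatching = cross_entropy(probabilities, [2, 0])
```

The loss is small when the highest classification probability falls on the category matched to the cluster label and grows as probability mass moves elsewhere.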

[0185] S209. The computer device trains the feature extraction model based on instance loss, cluster loss, and classification loss.

[0186] As shown in Figure 10, in this embodiment of the application, the instance loss, cluster loss, and classification loss can be fused to obtain the fused loss, which serves as the total loss function. The feature extraction model is trained based on the total loss function. The cluster loss can include a first cluster loss and a second cluster loss.
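How the losses are fused is not stated in this passage. A weighted sum is one common choice and is used in the sketch below as an assumption; the weights and the function name are hypothetical, and summing the first and second cluster losses is likewise an assumption.

```python
def total_loss(instance_loss, first_cluster_loss, second_cluster_loss,
               classification_loss, w_instance=1.0, w_cluster=1.0, w_class=1.0):
    """Fuse the individual losses into one training objective (assumed weighted sum)."""
    cluster_loss = first_cluster_loss + second_cluster_loss
    return (w_instance * instance_loss
            + w_cluster * cluster_loss
            + w_class * classification_loss)

fused = total_loss(1.0, 0.5, 0.25, 2.0)
```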

[0187] This application embodiment can obtain a group of homologous face images corresponding to the original face image. The homologous face image group includes at least two homologous face images of the original face image. A feature extraction model is used to extract features from the face images in the homologous face image group to obtain a feature information set, which includes the face image feature information corresponding to the original face image. Based on the face image feature information, the homologous similarity between homologous face images of the original face image and the non-homologous similarity between homologous face images and non-homologous face images are calculated. Based on the homologous similarity and non-homologous similarity, the instance loss of the original face image is calculated. The instance loss represents the difference loss between the homologous similarity and the non-homologous similarity of the original face image. Feature mapping is performed based on the face image feature information to obtain the first and second classification probability distributions of homologous face images in the classification categories. The cluster loss of the original face image is calculated based on the first and second classification probability distributions and represents the difference loss between them. The feature extraction model is trained based on the instance loss and cluster loss.
Because the feature extraction model in this embodiment can mine the inherent correlation and data patterns of sample data through instance loss and cluster loss, that is, the inherent correlation and data patterns within the homologous face image group of an original face image and across the homologous face image groups of different original face images, the feature extraction model has stronger feature expression capability and stronger positive and negative sample discrimination capability, thereby improving the clustering accuracy of the feature extraction model.
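The instance loss summarized above contrasts homologous similarity with non-homologous similarity. The exact formula is not reproduced in this passage, so the following is a sketch under stated assumptions rather than the formula of this application: it uses cosine similarity between feature vectors and a contrastive form in which the two homologous face images of one original image form the positive pair and all face images of other original images form the negatives. The temperature `tau` and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def instance_loss(feats_first, feats_second, tau=0.5):
    """Assumed contrastive loss over original face images (rows)."""
    n = len(feats_first)
    loss = 0.0
    for i in range(n):
        # Homologous similarity: two views of the same original image.
        pos = math.exp(cosine(feats_first[i], feats_second[i]) / tau)
        neg = 0.0
        for j in range(n):
            if j != i:
                # Non-homologous similarity: views of different original images.
                neg += math.exp(cosine(feats_first[i], feats_first[j]) / tau)
                neg += math.exp(cosine(feats_first[i], feats_second[j]) / tau)
        loss += -math.log(pos / (pos + neg))
    return loss / n

feats_first = [[1.0, 0.0], [0.0, 1.0]]
aligned = instance_loss(feats_first, [[0.9, 0.1], [0.1, 0.9]])
misaligned = instance_loss(feats_first, [[0.1, 0.9], [0.9, 0.1]])
```

Under these assumptions the loss decreases as homologous face images move closer together in feature space and non-homologous face images move apart.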

[0188] The embodiments of this application can effectively improve the clustering accuracy of feature extraction models on a wide range of faces and reduce the misclassification rate of feature extraction models.

[0189] The feature extraction model in this application essentially explores the data correlation within each level by performing self-supervised learning at two levels: the instance self-supervised module and the cluster self-supervised module. It deeply mines the data patterns of different datasets, that is, the intrinsic correlation and data patterns of homologous face image groups of original face images and homologous face image groups corresponding to different original face images. This makes the feature extraction model have stronger feature expression ability and positive and negative sample discrimination ability, thereby improving the clustering accuracy of the feature extraction model.

[0190] Furthermore, this embodiment of the application aligns the clustering results of the instance self-supervised module and the cluster self-supervised module at two levels through an interactive category alignment module, enabling the exploration of cross-level data correlations. This embodiment of the application, by aligning the instance self-supervised module through the interactive category alignment module, can simultaneously focus on feature learning and clustering tasks, using a single-stage learning approach to link the originally separate clustering and feature learning, thereby improving the clustering accuracy of the feature extraction model.

[0191] The embodiments of this application can be applied to general object clustering, including faces, and can also assist in subsequent data annotation work.

[0192] The embodiments of this application can also be used to test the trained feature extraction model. As shown in Figure 11, the original face image is processed sequentially through the sample reading module, sample preprocessing module, and feature extraction module to obtain the test face feature information. The test face feature information is then clustered to obtain the clustering labels of the original face image, thus achieving higher clustering accuracy.

[0193] As shown in Figure 12, the original face images include generic face images and real face images, with the real face images being filtered out in the trained feature extraction model. In Figure 11, ID_1, ID_2, ID_3, and the rest are the clustering labels corresponding to the original face images.

[0194] To better implement the above methods, this application also provides a feature extraction model training device, which can be integrated into a computer device, such as a server or terminal. The terminal may include a tablet computer, a laptop computer, and / or a personal computer.

[0195] For example, as shown in Figure 13, the feature extraction model training device may include an acquisition unit 301, an extraction unit 302, a first calculation unit 303, a second calculation unit 304, a mapping unit 305, a third calculation unit 306, and a training unit 307, as follows:

[0196] (1) Acquisition unit 301;

[0197] The acquisition unit 301 can be used to acquire a group of homologous face images corresponding to the original face image, wherein the group of homologous face images includes at least two homologous face images of the original face image.

[0198] (2) Extraction unit 302;

[0199] The extraction unit 302 can be used to extract features from the face images in the homologous face image group using a feature extraction model, and obtain a feature information set, which includes the face image feature information corresponding to the original face image.

[0200] (3) First calculation unit 303;

[0201] The first calculation unit 303 can be used to calculate the homologous similarity between homologous face images of the original face image and the non-homologous similarity between homologous face images and non-homologous face images based on the face image feature information.

[0202] (4) Second calculation unit 304;

[0203] The second calculation unit 304 can be used to calculate the instance loss of the original face image based on homologous similarity and non-homologous similarity. The instance loss represents the difference loss between the homologous similarity of the original face image and the non-homologous similarity of the original face image.

[0204] (5) Mapping unit 305;

[0205] The mapping unit 305 can be used to perform feature mapping processing based on the face image feature information to obtain the first classification probability distribution of the homologous face images in the classification category and the second classification probability distribution of the homologous face images in the classification category.

[0206] The mapping unit 305 can be used to classify homologous face images based on face image feature information to obtain the initial classification probability distribution of homologous face images in each classification category; and to split the initial classification probability distribution to obtain the first classification probability distribution of the first homologous face image in each classification category and the second classification probability distribution of the second homologous face image in each classification category.

[0207] (6) Third calculation unit 306;

[0208] The third calculation unit 306 can be used to calculate the cluster loss of the original face image based on the first classification probability distribution and the second classification probability distribution. The cluster loss characterizes the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image.

[0209] The third calculation unit 306 can specifically be used to calculate the first distribution difference information of the original face image in the target classification category based on the first classification probability distribution and the second classification probability distribution of the target classification category; calculate the second distribution difference information of the original face image between the target classification category and the non-target classification category based on the first classification probability distribution, the second classification probability distribution of the target classification category, the first classification probability distribution of the non-target classification category, and the second classification probability distribution of the non-target classification category; calculate the first cluster loss of the original face image based on the first distribution difference information and the second distribution difference information, wherein the first cluster loss characterizes the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image in the target classification category; and generate a cluster loss based on the first cluster loss.

[0210] The third calculation unit 306 can be specifically used to calculate the second cluster loss of the original face image based on the first classification probability distribution of the first homologous face image in each classification category and the second classification probability distribution of the second homologous face image in each classification category. The second cluster loss represents the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image in different classification categories; and a cluster loss is generated based on the second cluster loss.

[0211] (7) Training Unit 307;

[0212] Training unit 307 can be used to train the feature extraction model based on instance loss and cluster loss.

[0213] Training unit 307 can be used to train the feature extraction model based on instance loss, cluster loss, and classification loss.

[0214] (8) Clustering unit 308;

[0215] Clustering unit 308 can be used to cluster homologous face images based on face image feature information to obtain cluster labels and the classification probability of homologous face images at the corresponding cluster labels; and calculate the classification loss of different original face images at the target cluster labels based on the classification probability.

[0216] Clustering unit 308 can be used to cluster homologous face images based on face image feature information to obtain cluster labels; and to match the cluster labels with the classification categories to obtain the classification probability of homologous face images in the corresponding cluster labels.

[0217] Clustering unit 308 can be used to match labels to facial image feature information to obtain facial image feature information carrying labels; to cluster the facial image feature information carrying labels to obtain clusters; and to determine the label that occurs most often in a cluster as the cluster label corresponding to that cluster.

[0218] As can be seen from the above, the acquisition unit 301 of this application embodiment can acquire a group of homologous face images corresponding to the original face image, the group of homologous face images including at least two homologous face images of the original face image; the extraction unit 302 uses a feature extraction model to extract features from the face images in the group of homologous face images to obtain a feature information set, the feature information set including the face image feature information corresponding to the original face image; the first calculation unit 303 can calculate the homologous similarity between homologous face images of the original face image and the non-homologous similarity between homologous face images and non-homologous face images based on the face image feature information; the second calculation unit 304 can calculate the instance loss of the original face image based on the homologous similarity and the non-homologous similarity, where the instance loss represents the difference loss between the homologous similarity and the non-homologous similarity of the original face image; the mapping unit 305 can perform feature mapping processing according to the face image feature information to obtain the first classification probability distribution and the second classification probability distribution of homologous face images in the classification category; the third calculation unit 306 can calculate the cluster loss of the original face image according to the first and second classification probability distributions, where the cluster loss represents the difference loss between the first and second classification probability distributions of the original face image; and the training unit 307 can train the feature extraction model according to the instance loss and cluster loss.
Since the feature extraction model of this embodiment can mine the inherent correlation and data patterns of sample data through instance loss and cluster loss, that is, the inherent correlation and data patterns within the homologous face image group of an original face image and across the homologous face image groups of different original face images, the feature extraction model has stronger feature expression ability and stronger positive and negative sample discrimination ability, thereby improving the clustering accuracy of the feature extraction model.

[0219] This application also provides a computer device. Figure 14 shows a structural schematic diagram of the computer device involved in the embodiments of this application, specifically:

[0220] The computer device may include components such as a processor 401 with one or more processing cores, a memory 402 with one or more computer-readable storage media, a power supply 403, and an input unit 404. Those skilled in the art will understand that the computer device structure shown in Figure 14 does not constitute a limitation on the computer device, which may include more or fewer components than shown, combine certain components, or have a different component arrangement. Wherein:

[0221] The processor 401 is the control center of the computer device. It connects various parts of the computer device via various interfaces and lines. By running or executing software programs and / or modules stored in the memory 402, and by calling data stored in the memory 402, it performs various functions of the computer device and processes data, thereby performing overall detection and control of the computer device. Optionally, the processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, and computer programs, and the modem processor mainly handles wireless communication. It is understood that the modem processor may not be integrated into the processor 401.

[0222] The memory 402 can be used to store software programs and modules. The processor 401 executes various functional applications and data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area. The program storage area may store the operating system, computer programs required for at least one function (such as sound playback function, image playback function, etc.), etc.; the data storage area may store data created according to the use of the computer device, etc. In addition, the memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.

[0223] The computer device also includes a power supply 403 that supplies power to the various components. Preferably, the power supply 403 can be logically connected to the processor 401 through a power management system, thereby enabling functions such as charging, discharging, and power consumption management through the power management system. The power supply 403 may also include one or more DC or AC power supplies, recharging systems, power fault detection circuits, power converters or inverters, power status indicators, and other arbitrary components.

[0224] The computer device may also include an input unit 404, which can be used to receive input numeric or character information, and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.

[0225] Although not shown, the computer device may also include a display unit, etc., which will not be described in detail here. Specifically, in this embodiment, the processor 401 in the computer device loads the executable files corresponding to the processes of one or more computer programs into the memory 402 according to the following instructions, and the processor 401 runs the computer programs stored in the memory 402 to realize various functions, as follows:

[0226] Obtain a group of homologous face images corresponding to the original face image, where the group includes at least two homologous face images of the original face image. Use a feature extraction model to extract features from the face images in the homologous face image group, obtaining a feature information set that includes the face image feature information corresponding to the original face image. Calculate the homologous similarity between homologous face images of the original face image and the non-homologous similarity between homologous and non-homologous face images based on the face image feature information. Calculate the instance loss of the original face image based on the homologous and non-homologous similarities, where the instance loss represents the difference loss between the homologous similarity of the original face image and the non-homologous similarity of the original face image. Perform feature mapping based on the face image feature information to obtain the first and second classification probability distributions of homologous face images in the classification categories. Calculate the cluster loss of the original face image based on these two probability distributions, where the cluster loss represents the difference loss between the first and second classification probability distributions of the original face image. Train the feature extraction model based on the instance loss and cluster loss.

[0227] For details on the implementation of each of the above operations, please refer to the previous examples, which will not be repeated here.

[0228] Those skilled in the art will understand that all or part of the steps in the various methods of the above embodiments can be performed by a computer program, or by a computer program controlling related hardware. The computer program can be stored in a computer-readable storage medium and loaded and executed by a processor.

[0229] Therefore, embodiments of this application provide a computer-readable storage medium storing a computer program that can be loaded by a processor to execute any of the feature extraction model training methods provided in embodiments of this application.

[0230] For details on the implementation of each of the above operations, please refer to the previous examples, which will not be repeated here.

[0231] The computer-readable storage medium may include: read-only memory (ROM), random access memory (RAM), disk or optical disk, etc.

[0232] Since the instructions stored in the computer-readable storage medium can execute the steps in any of the feature extraction model training methods provided in the embodiments of this application, the beneficial effects that any of the feature extraction model training methods provided in the embodiments of this application can achieve can be realized, as detailed in the preceding embodiments, and will not be repeated here.

[0233] According to one aspect of this application, a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the computer device to perform the methods provided in the various optional implementations of the above embodiments.

[0234] The foregoing has provided a detailed description of a feature extraction model training method and related apparatus provided in the embodiments of this application. The related apparatus includes a feature extraction model training device, a computer device, and a computer-readable storage medium. Specific examples have been used to illustrate the principles and implementation methods of this application. The descriptions of the above embodiments are only for the purpose of helping to understand the method and its core ideas of this application. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.

Claims

1. A method for training a feature extraction model, characterized in that it comprises: Obtaining at least two original face images and their corresponding homologous face image groups, where each homologous face image group includes at least two face images, and all face images in the same homologous face image group are homologous face images corresponding to the same original face image; Using a feature extraction model to extract features from the face images in the homologous face image groups to obtain a feature information set, which includes the face image feature information corresponding to the original face image; Using the instance self-supervised module in the feature extraction model to calculate, based on the face image feature information, the homologous similarity between homologous face images and the non-homologous similarity between non-homologous face images, where any two face images belonging to different homologous face image groups are non-homologous face images; Using the instance self-supervised module to calculate the instance loss of the original face image based on the homologous similarity and the non-homologous similarity, where the instance loss represents the difference loss between the homologous similarity and the non-homologous similarity of the original face image; Using the cluster self-supervised module in the feature extraction model to perform feature mapping processing based on the face image feature information to obtain the first classification probability distribution of the homologous face images in the classification category and the second classification probability distribution of the homologous face images in the classification category; Using the instance self-supervised module to cluster the homologous face images based on the face image feature information to obtain cluster labels; Using the interactive category alignment module in the feature extraction model to match the cluster labels with the classification categories to supervise the cluster self-supervised module in class allocation and training, so as to obtain the classification probability of the homologous face images at the cluster labels; Using the cluster self-supervised module to calculate the classification loss of different original face images at the target cluster label based on the classification probability, where the classification loss constrains the homologous face images corresponding to each original face image to have the maximum classification probability at the corresponding cluster label; Using the cluster self-supervised module to calculate the cluster loss of the original face image based on the first classification probability distribution and the second classification probability distribution, where the cluster loss represents the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image; and Training the feature extraction model based on the instance loss, the cluster loss, and the classification loss.

2. The feature extraction model training method according to claim 1, characterized in that the step in which the instance self-supervised module clusters the homologous face images based on the face image feature information to obtain cluster labels includes: Matching labels to the face image feature information to obtain face image feature information carrying labels; Clustering the face image feature information carrying the labels to obtain clusters; Determining the label that occurs most often in a cluster as the cluster label corresponding to that cluster.

3. The feature extraction model training method according to claim 1, characterized in that the homologous face image group includes a first homologous face image and a second homologous face image; the step of using the cluster self-supervised module to perform feature mapping processing based on the face image feature information to obtain a first classification probability distribution of the homologous face images in the classification category and a second classification probability distribution of the homologous face images in the classification category includes: Classifying the homologous face images according to the face image feature information to obtain the initial classification probability distribution of the homologous face images in each classification category; Splitting the initial classification probability distribution to obtain the first classification probability distribution of the first homologous face image in each classification category and the second classification probability distribution of the second homologous face image in each classification category.

4. The feature extraction model training method according to claim 3, characterized in that the step of calculating the cluster loss of the original face image based on the first classification probability distribution and the second classification probability distribution includes: Calculating the first distribution difference information of the original face image in the target classification category based on the first classification probability distribution of the target classification category and the second classification probability distribution of the target classification category; Calculating the second distribution difference information of the original face image between the target classification category and the non-target classification category based on the first classification probability distribution of the target classification category, the second classification probability distribution of the target classification category, the first classification probability distribution of the non-target classification category, and the second classification probability distribution of the non-target classification category; Calculating the first cluster loss of the original face image based on the first distribution difference information and the second distribution difference information, where the first cluster loss represents the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image in the target classification category; Generating the cluster loss based on the first cluster loss.

5. The feature extraction model training method according to claim 3, characterized in that the step of calculating the cluster loss of the original face image based on the first classification probability distribution and the second classification probability distribution includes: calculating a second cluster loss of the original face image based on the first classification probability distribution of the first homologous face image in each classification category and the second classification probability distribution of the second homologous face image in each classification category, wherein the second cluster loss represents the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image in different classification categories; and generating the cluster loss based on the second cluster loss.
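Claim 5 does not state how the second cluster loss is computed. As one possible reading only, the sketch below measures, for each original face image, how far its two per-category distributions diverge across all classification categories, using a symmetric Kullback-Leibler divergence. The choice of divergence and the name `second_cluster_loss` are assumptions, not details taken from the patent.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same categories."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def second_cluster_loss(p_first, p_second):
    """Mean symmetric KL divergence between paired rows (assumed form).

    Row i of p_first and row i of p_second are the distributions of the
    first and second homologous images of original image i.
    """
    total = 0.0
    for row_a, row_b in zip(p_first, p_second):
        total += 0.5 * (kl_divergence(row_a, row_b) + kl_divergence(row_b, row_a))
    return total / len(p_first)
```

The value is zero when the two homologous images of every original receive identical distributions and grows as the distributions drift apart.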

6. A feature extraction model training device, characterized in that it comprises: an acquisition unit, used to acquire at least two original face images and their corresponding homologous face image groups, wherein each homologous face image group includes at least two face images, and all face images in the same homologous face image group are homologous face images corresponding to the same original face image; an extraction unit, used to extract features from the face images in the homologous face image groups using a feature extraction model to obtain a feature information set, wherein the feature information set includes the face image feature information corresponding to the original face image; a first calculation unit, used to calculate the homologous similarity between homologous face images and the non-homologous similarity between non-homologous face images based on the face image feature information using the instance self-supervised module in the feature extraction model, wherein any two face images belonging to different homologous face image groups are non-homologous face images; a second calculation unit, used to calculate the instance loss of the original face image based on the homologous similarity and the non-homologous similarity using the instance self-supervised module, wherein the instance loss represents the difference loss between the homologous similarity and the non-homologous similarity of the original face image; a mapping unit, used to perform feature mapping processing based on the face image feature information using the cluster self-supervised module in the feature extraction model, to obtain the first classification probability distribution of the homologous face images in the classification category and the second classification probability distribution of the homologous face images in the classification category; a clustering unit, used to cluster the homologous face images according to the face image feature information using the instance self-supervised module to obtain cluster labels, to match the cluster labels with the classification categories using the interactive category alignment module in the feature extraction model so as to supervise the cluster self-supervised module in class allocation and training and obtain the classification probability of the homologous face images on the cluster labels, and to calculate, using the cluster self-supervised module, the classification loss of different original face images on the target cluster label based on the classification probability, wherein the classification loss constrains the homologous face images corresponding to each original face image to have the maximum classification probability on the corresponding cluster label; a third calculation unit, used to calculate the cluster loss of the original face image based on the first classification probability distribution and the second classification probability distribution using the cluster self-supervised module, wherein the cluster loss represents the difference loss between the first classification probability distribution and the second classification probability distribution of the original face image; and a training unit, used to train the feature extraction model based on the instance loss, the cluster loss, and the classification loss.
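The second calculation unit and the training unit of claim 6 can be sketched as follows. This is an illustrative reading, not the patented formulation. The instance loss is written as a temperature-scaled contrastive loss over cosine similarities, the classification loss as a cross-entropy against cluster labels, and the three losses are combined by a weighted sum. The temperature, the weights, and all function names are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def instance_loss(first_feats, second_feats, temperature=0.5):
    """Contrast homologous similarity against non-homologous similarity.

    first_feats[i] and second_feats[i] are features of the two homologous
    images of original i; every other pairing is non-homologous.
    """
    n = len(first_feats)
    feats = first_feats + second_feats
    loss = 0.0
    for i in range(2 * n):
        partner = (i + n) % (2 * n)  # index of the homologous image
        pos = math.exp(cosine(feats[i], feats[partner]) / temperature)
        neg = sum(
            math.exp(cosine(feats[i], feats[j]) / temperature)
            for j in range(2 * n)
            if j not in (i, partner)
        )
        loss += -math.log(pos / (pos + neg))
    return loss / (2 * n)


def classification_loss(probs, cluster_labels):
    """Cross-entropy pushing each image toward its matched cluster label."""
    return -sum(
        math.log(p[label] + 1e-12) for p, label in zip(probs, cluster_labels)
    ) / len(probs)


def total_loss(instance, cluster, classification, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three losses (weights are an assumption)."""
    w_inst, w_clu, w_cls = weights
    return w_inst * instance + w_clu * cluster + w_cls * classification
```

In an actual training loop the combined value would be minimised by gradient descent over the feature extraction model's parameters; the pure-Python version here only shows how the quantities relate.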

7. A computer device, characterized in that it comprises a memory and a processor, wherein the memory stores a computer program, and the processor is used to run the computer program in the memory to perform the feature extraction model training method according to any one of claims 1 to 5.

8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program adapted to be loaded by a processor to execute the feature extraction model training method according to any one of claims 1 to 5.