Image classification method, electronic device, storage medium and program product
An image classification method in the field of image processing, addressing the problem that an unbalanced number of sample images across categories prevents the image classification model from learning to distinguish the categories well, thereby affecting the accuracy of the image classification model.
Pending Publication Date: 2022-04-22
BEIJING KUANGSHI TECH +2
AI-Extracted Technical Summary
Problems solved by technology
If the number of sample images in each category is unbalanced in the sample set used for model training, the image classification model may fail to learn to distinguish the characteristics of each category of sample images, thereby affecting the accuracy of the image classification model.
Method used
With the technical scheme of the embodiments of the present application, the basic loss is determined based on the classification prediction result of each sample image and its true category label; training the preset model with the basic loss therefore guides the model, when classifying a sample image, to classify it into the true category to which it belongs as far as possible. The inter-class loss is determined based on the classification prediction result of each sample image and the confidence with which sample images of each category are predicted as the true category of that sample image; training the preset model with the inter-class loss therefore constrains, for each sample image, the confidence with which sample images of other categories are misclassified into the true category of that sample image to equal the confidence with which that sample image is misclassified into those other categories. This adopts the idea of adversarial training to correct misclassification. Consequently, training the preset model with both the basic loss and the inter-class loss yields an image classification model with high accuracy.
With the technical scheme of the embodiments of the present application, the confusion matrix is built from the classification prediction results averaged over multiple rounds of training. This makes the obtained confusion matrix more accurate and prevents the result of a single round of training from severely affecting the training effect of the model, thereby improving the accuracy of the trained image classification model.
Abstract
The invention provides an image classification method, an electronic device, a storage medium and a program product, relates to the technical field of image processing, and aims to realize accurate classification of an image to be classified by using an image classification model. The method comprises the following steps: acquiring an image to be classified; inputting the image to be classified into an image classification model to obtain a classification prediction result of the image to be classified, the image classification model being obtained by training a preset model using a basic loss and an inter-class loss. The basic loss is determined according to the classification prediction result of each sample image predicted by the preset model and the true category label of each sample image; the inter-class loss is determined according to the classification prediction result of each sample image predicted by the preset model and the soft category label of each sample image, where the soft category label of a sample image is determined according to the confidence with which sample images of each category are predicted as the true category of that sample image.
Application Domain
Character and pattern recognition; Still image data clustering/classification
Technology Topic
Sample image; Engineering +3; Image
Examples
- Experimental program (1)
Example Embodiment
[0060] In order to make the above purposes, features and advantages of the present application more apparent and understandable, the present application is described in further detail below in conjunction with the accompanying drawings and specific embodiments.
[0061] In recent years, important progress has been made in research on computer vision, deep learning, machine learning, image processing, image recognition and other technologies based on artificial intelligence. Artificial Intelligence (AI) is an emerging science and technology that researches and develops theories, methods, technologies and application systems for simulating and extending human intelligence. Artificial intelligence is a comprehensive discipline involving chips, big data, cloud computing, the Internet of Things, distributed storage, deep learning, machine learning, neural networks and many other technologies. Computer vision, as an important branch of artificial intelligence, specifically aims to make machines recognize the world; computer vision technology usually includes face recognition, liveness detection, fingerprint recognition and anti-counterfeiting verification, biometric recognition, face detection, pedestrian detection, object detection, pedestrian recognition, image processing, image recognition, image semantic understanding, image retrieval, text recognition, video processing, video content recognition, behavior recognition, three-dimensional reconstruction, virtual reality, augmented reality, simultaneous localization and mapping (SLAM), computational photography, and robot navigation and positioning technologies. With the research and progress of artificial intelligence technology, this technology has been applied in many fields, such as security, urban management, traffic management, building management, park management, face-based access, face-based attendance, logistics management, warehouse management, robots, intelligent marketing, computational photography, mobile phone imaging, cloud services, smart home, wearable devices, unmanned driving, automatic driving, intelligent medical care, face payment, face unlocking, fingerprint unlocking, identity verification, smart screens, smart TVs, cameras, the mobile Internet, webcasting, beautification, makeup, medical beauty, intelligent temperature measurement and other fields.
[0062] In order to solve the technical problem that the accuracy of image classification models in the related art is not high, the applicant proposes to train the model using both the basic loss and the inter-class loss, so as to improve the accuracy of the resulting image classification model.
[0063] Referring to Figure 1, a flowchart of the steps of an image classification method in an embodiment of the present invention is shown. As shown in Figure 1, the image classification method may specifically include the following steps:
[0064] Step S11: Acquire the image to be classified;
[0065] Step S12: Input the image to be classified into an image classification model to obtain the classification prediction result of the image to be classified, wherein the image classification model is obtained by training a preset model using a basic loss and an inter-class loss; the basic loss is determined according to the classification prediction result of each sample image predicted by the preset model and the true category label of each sample image; the inter-class loss is determined according to the classification prediction result of each sample image predicted by the preset model and the soft category label of each sample image, where the soft category label of a sample image is determined based on the confidence with which sample images of each category are predicted as the true category of that sample image.
[0066] Classifying an image can refer to classifying the image itself (e.g., the category of an image can be landscape, people, animal, etc.), or it can refer to classifying the foreground objects in the image (e.g., classifying a photo of a cat as a cat and a photo of a dog as a dog). The embodiments of the present application explain the image classification method by taking the classification of foreground objects in an image as an example; it will be appreciated that classifying the image itself follows a similar idea.
[0067] When the image to be classified is input into the image classification model, the image classification model can predict the confidence that the image to be classified belongs to each category, and either output the confidence for each category as the classification prediction result of the image to be classified, or output the category with the highest confidence as the classification prediction result of the image to be classified.
[0068] The image classification model is obtained by training a preset model using the basic loss and the inter-class loss. The classification prediction result of a sample image determined by the preset model during training is the confidence that the sample image belongs to each category. Each category refers to the categories of all sample images used during training plus the background category. The order of the categories can be fixed first, and a vector can then be generated from the confidences with which the preset model determines that the sample image belongs to each category; this vector represents the classification prediction result of the sample image. For example, suppose there are four categories in total, in the order cat, dog, pig, background. If the preset model determines that a sample image belongs to each category with confidences cat: 0.5, dog: 0.3, pig: 0.1 and background: 0.1, then the classification prediction result of the sample image determined by the preset model can be expressed as the vector [0.5, 0.3, 0.1, 0.1].
[0069] Optionally, in order to make the preset model focus on learning the characteristics of the sample images of the categories other than the background category, the confidence of the background category can be deleted from the classification prediction result, and the remaining confidences can then be normalized. Following the example above, after removing the background confidence from the classification prediction result [0.5, 0.3, 0.1, 0.1], the new classification prediction result is approximately [0.56, 0.33, 0.11].
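As a minimal sketch of this step (the NumPy helper and its name are illustrative, not from the application):

```python
import numpy as np

def drop_background(pred, background_index=-1):
    # Remove the background-category confidence, then renormalize
    # the remaining confidences so they sum to 1.
    pred = np.delete(pred, background_index)
    return pred / pred.sum()

pred = np.array([0.5, 0.3, 0.1, 0.1])  # order: cat, dog, pig, background
print(drop_background(pred))           # -> approx. [0.556, 0.333, 0.111]
```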
[0070] The true category label of a sample image characterizes the category to which the sample image actually belongs. Corresponding to the classification prediction result of the sample image, the true category label can also be a vector, and the order of the categories represented by the elements of the true category label vector is the same as the order of the categories represented by the elements of the classification prediction result of the sample image. Continuing the previous example, if a sample image actually belongs to the dog category, the true category label of the sample image can be [0, 1, 0, 0]; after removing the background category, the true category label of the sample image can be [0, 1, 0].
[0071] For each sample image, the basic loss can be established based on the classification prediction result of the sample image and the true category label of the sample image. Training the preset model with the basic loss guides the preset model to classify the sample image into the true category to which it belongs as far as possible.
[0072] The preset model can also be trained using the inter-class loss, which is determined based on the classification prediction result of each sample image and its soft category label; the soft category label of a sample image is determined based on the confidence with which sample images of each category are predicted as the true category of that sample image. It is understandable that sample images of the same true category have the same soft category label. Corresponding to the classification prediction result of the sample image, the soft category label can also be a vector, and the order of the categories represented by the elements of the soft category label vector is the same as the order of the categories represented by the elements of the classification prediction result of the sample image.
[0073] For example, suppose the categories of all sample images are cat, dog and pig. The soft category label of a sample image whose true category is dog is the confidence with which sample images of each category are predicted as dog: if all cat sample images are predicted as dog with an average confidence of 0.2, all dog sample images are predicted as dog with an average confidence of 0.6, and all pig sample images are predicted as dog with an average confidence of 0.1, then the soft category label of the dog sample images may be [0.2, 0.6, 0.1]. Optionally, this soft category label can be normalized to approximately [0.22, 0.67, 0.11].
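A small sketch of this construction, using only the numbers from the example above (the variable names are ours):

```python
import numpy as np

# Average confidence with which cat, dog and pig sample images are
# predicted as "dog", taken from the example above.
avg_conf_as_dog = np.array([0.2, 0.6, 0.1])

# The soft category label of all dog sample images, optionally normalized.
soft_label_dog = avg_conf_as_dog / avg_conf_as_dog.sum()
print(soft_label_dog)  # -> approx. [0.222, 0.667, 0.111]
```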
[0074] Training the preset model with the inter-class loss constrains, for each sample image, the confidence with which the preset model misclassifies sample images of other categories into the true category of that sample image to equal the confidence with which it misclassifies the sample image into those other categories. In this way, an adversarial training is formed, guiding the model to distinguish the differences between the various sample images that are predicted as the same category, and realizing the correction of misclassification.
[0075] With the technical solution of the embodiments of the present application, the basic loss is determined based on the classification prediction result of each sample image and its true category label, so training the preset model with the basic loss guides the preset model, when classifying a sample image, to classify it into the true category to which it belongs as far as possible. The inter-class loss is determined based on the classification prediction result of each sample image and the confidence with which sample images of each category are predicted as the true category of that sample image; training the preset model with the inter-class loss therefore constrains, for each sample image, the confidence with which sample images of other categories are misclassified into the true category of that sample image to equal the confidence with which that sample image is misclassified into those other categories, using the idea of adversarial training to correct misclassification. Therefore, training the preset model with both the basic loss and the inter-class loss yields an image classification model with higher accuracy.
[0076] Optionally, the image classification model is obtained by training the preset model with multiple sample images carrying true category labels, based on the basic loss and the inter-class loss. Optionally, the basic loss is a cross-entropy loss established based on the difference between the classification prediction result of each sample image and the true category label of the sample image, and the inter-class loss is a cross-entropy loss established based on the difference between the classification prediction result of each sample image and the soft category label of the sample image.
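As a hedged PyTorch sketch of these two cross-entropy losses (assuming the classification prediction results are produced as logits; the function names and the example weighting factor are ours, not fixed by the application):

```python
import torch
import torch.nn.functional as F

def basic_loss(logits, true_labels):
    # Cross-entropy between the classification prediction and the
    # hard (true) category label of each sample image.
    return F.cross_entropy(logits, true_labels)

def inter_class_loss(logits, soft_labels):
    # Cross-entropy between the classification prediction and the
    # soft category label: -sum_c q_c * log p_c, averaged over the batch.
    log_probs = F.log_softmax(logits, dim=1)
    return -(soft_labels * log_probs).sum(dim=1).mean()

logits = torch.randn(8, 3)                   # 8 samples, 3 categories
true_labels = torch.randint(0, 3, (8,))      # hard labels
soft_labels = torch.full((8, 3), 1.0 / 3.0)  # placeholder soft labels
loss = basic_loss(logits, true_labels) + 0.1 * inter_class_loss(logits, soft_labels)
```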
[0077] For the problem that some sample images are easily misclassified into other categories, training the preset model with the inter-class loss can prompt the model to pay attention to the differences between the sample images that are easily misclassified as another category, so that the model "recognizes" the differences between the sample images of one category and those of the other category. For example, suppose the true categories of the sample images are cat and dog, and the preset model easily predicts some cats (e.g., hairless cats) as dogs. A soft category label can be generated based on the confidence with which each category of sample image is predicted as dog, and the inter-class loss established from this soft category label can be used to train the model, so that the model pays more attention to the differences between the sample images predicted as dogs (hairless cats and real dogs), and the trained model can distinguish hairless cats from dogs as far as possible.
[0078] Optionally, when training the model, the inter-class loss may be calculated only for the sample images of the categories with a high probability of being incorrectly predicted, that is, the categories that are easily misclassified.
[0079] Optionally, the inter-class loss can be established using a confusion matrix, a visualization tool that intuitively reflects the true categories of the sample images, the classification predictions of the preset model for the sample images, and the relationship between the true categories and the classification predictions. The embodiments of the present application use rows to characterize the true categories of the sample images and columns to characterize the predicted categories; it will be appreciated that the roles of rows and columns may also be swapped, with the other technical means adjusted accordingly.
[0080] The elements in the confusion matrix are determined based on the average classification prediction of each category of sample image. Each row of the confusion matrix represents the average classification prediction result of the sample images whose true category corresponds to that row, and each column represents the average confidence with which the sample images of each category are predicted as the category corresponding to that column.
[0081] Figure 2 shows a schematic diagram of a confusion matrix, whose meaning is as follows: sample images whose true category is cat are predicted as cat with an average confidence of 0.6, as dog with 0.3, and as pig with 0.1; sample images whose true category is dog are predicted as cat with 0.2, as dog with 0.7, and as pig with 0.1; sample images whose true category is pig are predicted as cat with 0.1, as dog with 0.2, and as pig with 0.7. For example, the first column of this confusion matrix indicates that all cat samples are predicted as cat with probability 0.6, all dog samples are incorrectly predicted as cat with probability 0.2, and all pig samples are incorrectly predicted as cat with probability 0.1.
[0082] The column vector corresponding to a sample image is the vector composed of the elements in the column of the confusion matrix whose predicted category corresponds to the true category of that sample image. This column vector serves as the soft category label, and the inter-class loss can be established based on the difference between the classification prediction result of each sample image and its corresponding column vector. For example, in Figure 2, the soft category label of all cat sample images is [0.6, 0.2, 0.1], the soft category label of all dog sample images is [0.3, 0.7, 0.2], and the soft category label of all pig sample images is [0.1, 0.1, 0.7].
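As an illustrative sketch of this construction (the helper names are ours; only the Figure 2 numbers come from the application):

```python
import numpy as np

def build_confusion_matrix(preds, true_labels, num_classes):
    # Row i: the average classification prediction of all sample images
    # whose true category is i (rows: true category, columns: predicted).
    cm = np.zeros((num_classes, num_classes))
    for c in range(num_classes):
        cm[c] = preds[true_labels == c].mean(axis=0)
    return cm

def soft_label(cm, true_class):
    # The soft category label of a sample image is the column of the
    # confusion matrix whose predicted category equals its true category.
    return cm[:, true_class]

# The Figure 2 confusion matrix (rows/columns in the order cat, dog, pig).
cm = np.array([[0.6, 0.3, 0.1],
               [0.2, 0.7, 0.1],
               [0.1, 0.2, 0.7]])
print(soft_label(cm, 0))  # cat -> [0.6, 0.2, 0.1]
print(soft_label(cm, 1))  # dog -> [0.3, 0.7, 0.2]
```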
[0083] During actual training, when the preset model determines the confidence that each sample image belongs to each category, it also determines the confidence that each sample image belongs to the background category. Optionally, in order to make the preset model learn the characteristics of the sample images of the categories other than the background category as well as possible and improve the classification accuracy of the trained image classification model, the column belonging to the background category can be deleted when establishing the confusion matrix, and the remaining elements of the confusion matrix can then be normalized; alternatively, only the confidences with which each sample image is predicted as the categories other than the background category are taken and normalized to obtain the normalized prediction result of each sample image, and the confusion matrix is then established based on the normalized prediction result of each sample image and the true category label it carries.
[0084] Optionally, multiple rounds of training are performed on the preset model, and the classification prediction results of each sample image obtained in each round of training are accumulated to obtain the confusion matrix. A confusion matrix of each training batch can be generated directly from the classification prediction results of each sample image in that round of training, and the elements at corresponding positions in the multiple confusion matrices can then be averaged to obtain the elements of the final confusion matrix, from which the final confusion matrix is established. Figure 3 shows a schematic diagram of obtaining the final confusion matrix by averaging the confusion matrices of different training batches.
[0085] Alternatively, the classification prediction results of each sample image over multiple training batches can be averaged directly, and the confusion matrix can be established based on the averaged classification prediction results. Averaging the classification prediction results means averaging the confidence of each category: for example, if one classification prediction result of a sample image is cat: 0.5, dog: 0.3, and another is cat: 0.3, dog: 0.1, then averaging these two classification prediction results gives cat: 0.4, dog: 0.2.
[0086] Optionally, the averaging of the confusion matrices of different training batches, and the averaging of the classification prediction results of different training batches, can be an exponential moving average, which increases the weight of the classification prediction results of the training batches that come later in time.
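A hedged sketch of such an exponential moving average over per-batch confusion matrices (the momentum value is an assumption):

```python
import numpy as np

def ema_update(cm_running, cm_batch, momentum=0.9):
    # Exponential moving average over training batches: each new batch
    # matrix receives weight (1 - momentum), so batches later in time
    # carry more weight than older ones.
    if cm_running is None:
        return cm_batch
    return momentum * cm_running + (1.0 - momentum) * cm_batch

cm_running = None
for cm_batch in [np.eye(3) * 0.8, np.eye(3) * 0.6]:  # toy per-batch matrices
    cm_running = ema_update(cm_running, cm_batch)
```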
[0087] With the technical scheme of the embodiments of the present application, the confusion matrix is established from the classification prediction results averaged over multiple rounds of training, which makes the obtained confusion matrix more accurate and prevents the result of a single round of training from severely affecting the training effect of the model, thereby improving the accuracy of the trained image classification model.
[0088] Alternatively, as an embodiment, the preset model is the classification branch of an untrained instance segmentation model, and the untrained instance segmentation model further comprises a position prediction branch; the following steps may be taken to train the classification branch of the untrained instance segmentation model to obtain the classification branch of the instance segmentation model:
[0089] Step S21: Obtain the image features of an image sample comprising at least one sample object, each sample object carrying its own true category label.
[0090] The classification branch of the instance segmentation model is trained using image samples, each including at least one sample object carrying its own true category label. The image features of an image sample include the color features, texture features, shape features, spatial relationship features and so on of the image sample. The embodiments of the present application do not limit the method of extracting the image features of the image sample.
[0091] Step S22: Input the image features of the image sample into the untrained instance segmentation model to obtain the first predicted position box of each sample object in the image sample output by the position prediction branch, and the first predicted category of each sample object in the image sample output by the classification branch.
[0092] Optionally, the image features of the image sample can be input into the untrained instance segmentation model, or the image sample can be fed directly into the untrained instance segmentation model, which then extracts the image features of the image sample itself.
[0093] The position prediction branch of an instance segmentation model predicts the locations of the sample objects in an image sample, and the classification branch predicts the categories of the sample objects. When training the instance segmentation model, the position prediction branch outputs the first predicted position box of each sample object in the image sample, and the classification branch outputs the first predicted category of each sample object in the image sample.
[0094] Step S23: Update the image features of the image sample based on the first predicted position box and the first predicted category of each sample object.
[0095] After the first predicted position box and the first predicted category are obtained, the features of the first predicted position box and the features of the first predicted category are fused into the image features of the image sample to obtain the updated image features.
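The application does not fix how the box and category features are fused into the image features; as one hedged sketch, concatenation followed by a linear projection (all names and dimensions below are assumptions, not the application's method) could look like:

```python
import torch
import torch.nn as nn

class FeatureUpdate(nn.Module):
    # Fuses the first predicted position box and first predicted category
    # into the per-object image features by concatenation plus projection.
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.proj = nn.Linear(feat_dim + 4 + num_classes, feat_dim)

    def forward(self, feats, boxes, class_probs):
        # feats: (N, feat_dim); boxes: (N, 4); class_probs: (N, num_classes)
        return self.proj(torch.cat([feats, boxes, class_probs], dim=1))

update = FeatureUpdate(feat_dim=256, num_classes=3)
feats = update(torch.randn(8, 256), torch.rand(8, 4), torch.rand(8, 3))
```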
[0096] Step S24: Obtain, based on the updated image features of the image sample, the second predicted category of each sample object in the image sample output by the classification branch.
[0097] Based on the updated image features, the classification branch can output a second predicted category for each sample object in the image sample.
[0098] Step S25: Establish the basic loss of the classification branch based on the difference between the second predicted category of each sample object in the image sample and the true category label of each sample object.
[0099] Based on the second predicted category of each sample object and the true category label of each sample object, the basic loss of the classification branch can be established; the specific method of establishing the basic loss of the classification branch may refer to the method of establishing the basic loss above, which will not be repeated here.
[0100] Step S26: Establish the inter-class loss of the classification branch based on the difference between the second predicted category of each sample object in the image sample and the confidence with which each sample object is predicted as the true category of the sample object.
[0101] Based on the second predicted category of each sample object and the confidence with which each sample object is predicted as the true category of the sample object, the inter-class loss of the classification branch can be established; the specific method of establishing the inter-class loss of the classification branch may refer to the method of establishing the inter-class loss above, which will not be repeated here.
[0102] Step S27: Train the classification branch based on the basic loss and the inter-class loss of the classification branch.
[0103] Based on the basic loss and the inter-class loss of the classification branch, the classification branch is trained to obtain the trained classification branch; a minimal sketch of such a combined training step is given below.
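This sketch reuses the hypothetical basic_loss and inter_class_loss helpers from the earlier loss sketch; the weight values are placeholders, not prescribed by the application:

```python
import torch

def train_step(model, optimizer, features, true_labels, soft_labels,
               w_basic=1.0, w_inter=0.1):
    # One training step combining the two losses with their weights.
    logits = model(features)
    loss = (w_basic * basic_loss(logits, true_labels)
            + w_inter * inter_class_loss(logits, soft_labels))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```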
[0104] With the technical scheme of the embodiments of the present application, when training the classification branch, the first predicted position box and the first predicted category are used to update the image features, and the sample objects are then classified based on the updated image features, so a more accurate second predicted category can be obtained; at the same time, the classification branch is trained using both its basic loss and its inter-class loss, which can improve the accuracy of the trained classification branch.
[0105] Alternatively, as an embodiment, multiple rounds of training may be performed on the classification branch. In each round of training, the second predicted position box of each sample object output by the position prediction branch is obtained, the image features are updated again based on the second predicted position box and the second predicted category to obtain the twice-updated image features, and the classification branch after that round of training is taken as the intermediate branch.
[0106] Taking the intermediate branch as the classification branch, the second predicted position box as the first predicted position box, and the second predicted category as the first predicted category, the steps of training the classification branch to obtain an intermediate branch are repeated until a preset condition is met; the training of the classification branch is then stopped, and the last obtained intermediate branch is taken as the trained classification branch of the instance segmentation model. The preset condition can be, for example, that the accuracy of the instance segmentation model reaches a preset threshold. In actual training, the applicant found that repeating the steps of training the classification branch to obtain an intermediate branch 2 or 3 times yields a classification branch with a good training effect. It is understandable that the basic loss and the inter-class loss of the classification branch are continually updated with the results of each round of classification prediction.
[0107] That is to say, after the twice-updated image features are obtained, the next round of training proceeds as follows: using the twice-updated image features, the classification branch is trained again based on its basic loss and inter-class loss to obtain the third predicted category of each sample object, and the third predicted position box output by the position prediction branch is obtained.
[0108] Optionally, the basic loss and the inter-class loss of the classification branch each have their own corresponding weight, and directly giving the inter-class loss of the classification branch a large weight may affect the classification accuracy of the classification branch. Therefore, the weight of the inter-class loss of the classification branch can be gradually increased over the rounds of training, as sketched below.
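One possible schedule (the linear ramp and the maximum weight are assumptions; the application only states that the weight is gradually increased):

```python
def inter_class_weight(round_idx, num_rounds, w_max=0.5):
    # Linearly ramp the inter-class loss weight from 0 up to w_max
    # over the training rounds.
    return w_max * round_idx / max(num_rounds - 1, 1)

weights = [inter_class_weight(r, 3) for r in range(3)]  # -> [0.0, 0.25, 0.5]
```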
[0109] After the classification branch of the instance segmentation model is trained, an unclassified image containing objects to be classified is input into the instance segmentation model, and the predicted category of each object to be classified in the unclassified image, as determined by the classification branch of the instance segmentation model, can be obtained. Here, the objects to be classified in the unclassified image are the foreground of the unclassified image.
[0110] It is understandable that the basic loss and the inter-class loss of the classification branch are used only when the classification branch is trained. When the trained classification branch of the instance segmentation model is applied in practice, the image features also undergo multiple updates inside the model, but the basic loss and the inter-class loss of the classification branch are not involved.
[0111] It should be noted that, for simplicity of description, the method embodiment is expressed as a series of combined actions, but those skilled in the art should be aware that embodiments of the present invention are not limited by the described order of actions, because according to embodiments of the present invention, certain steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also be aware that the embodiments described in the specification are preferred embodiments, and the actions involved are not necessarily required by embodiments of the present invention.
[0112] Figure 4 is a schematic structural diagram of an embodiment of an image classification apparatus of the present invention. As shown in Figure 4, the image classification apparatus comprises an acquisition module and a classification module, wherein:
[0113] the acquisition module is used for obtaining the image to be classified;
[0114] the classification module is used for inputting the image to be classified into the image classification model to obtain the classification prediction result of the image to be classified, wherein the image classification model is obtained by training a preset model using the basic loss and the inter-class loss; the basic loss is determined based on the classification prediction result of each sample image predicted by the preset model and the true category label of each sample image; the inter-class loss is determined based on the classification prediction result of each sample image predicted by the preset model and the soft category label of each sample image, where the soft category label of a sample image is determined based on the confidence with which sample images of each category are predicted as the true category of that sample image.
[0115] Alternatively, as an embodiment, the training process of the image classification model comprises the following steps:
[0116] Obtain a plurality of sample images carrying true category labels, and input them into the preset model to obtain the classification prediction result of each sample image;
[0117] Establish the basic loss based on the difference between the classification prediction result of each sample image and the true category label of the sample image;
[0118] Establish the inter-class loss based on the difference between the classification prediction result of each sample image and the soft category label of the sample image;
[0119] Train the preset model based on the basic loss and the inter-class loss to obtain the image classification model.
[0120] Alternatively, as an embodiment, establishing the inter-class loss based on the difference between the classification prediction result of each sample image and the soft category label of the sample image comprises:
[0121] establishing a confusion matrix based on the classification prediction result of each sample image and the true category label carried by the sample image, where an element in the confusion matrix represents the average confidence with which the sample images whose true category is the category of the row in which the element is located are predicted as the category of the column in which the element is located;
[0122] taking the column vector corresponding to each sample image in the confusion matrix as the soft category label of the sample image, where the column vector corresponding to a sample image characterizes the confidence with which sample images of each category are predicted as the true category of that sample image;
[0123] establishing the inter-class loss based on the difference between the classification prediction result of each sample image and the column vector corresponding to the sample image.
[0124] Alternatively, as an embodiment, establishing a confusion matrix based on the classification prediction result of each sample image and the true category label carried by the sample image comprises:
[0125] obtaining the confidence with which each sample image is predicted as each category other than the background category;
[0126] normalizing the confidences with which each sample image is predicted as the categories other than the background category to obtain the normalized prediction result of each sample image;
[0127] establishing the confusion matrix based on the normalized prediction result of each sample image and the true category label carried.
[0128] Alternatively, as an embodiment, obtaining the classification prediction result of each sample image comprises:
[0129] obtaining the classification prediction result of each sample image in different training batches;
[0130] and establishing a confusion matrix based on the classification prediction result of each sample image and the true category label carried by the sample image comprises:
[0131] establishing the confusion matrices of the different training batches based on the true category label of each sample image and the classification prediction result of the sample image in each training batch;
[0132] averaging the elements at corresponding positions in the confusion matrices of the different training batches to obtain the elements of the final confusion matrix, and then establishing the final confusion matrix.
[0133] Alternatively, as an embodiment, the preset model is the classification branch of an untrained instance segmentation model, and the untrained instance segmentation model further comprises a position prediction branch; the training process of the classification branch of the instance segmentation model comprises at least the following steps:
[0134] obtaining the image features of an image sample comprising at least one sample object, each sample object carrying its own true category label;
[0135] inputting the image features of the image sample into the untrained instance segmentation model to obtain the first predicted position box of each sample object in the image sample output by the position prediction branch, and the first predicted category of each sample object in the image sample output by the classification branch;
[0136] updating the image features of the image sample based on the first predicted position box and the first predicted category of each sample object;
[0137] obtaining, based on the updated image features of the image sample, the second predicted category of each sample object in the image sample output by the classification branch;
[0138] establishing the basic loss of the classification branch based on the difference between the second predicted category of each sample object in the image sample and the true category label of each sample object;
[0139] establishing the inter-class loss of the classification branch based on the difference between the second predicted category of each sample object in the image sample and the confidence with which each sample object is predicted as the true category of the sample object;
[0140] training the classification branch based on the basic loss and the inter-class loss of the classification branch.
[0141] Alternatively, as an embodiment, the apparatus further comprises:
[0142] a second acquisition module, for obtaining the second predicted position box of each sample object in the image sample, output by the position prediction branch based on the updated image features of the image sample;
[0143] and training the classification branch based on the basic loss and the inter-class loss of the classification branch comprises:
[0144] training the classification branch based on the basic loss of the classification branch and its weight, and the inter-class loss of the classification branch and its weight, to obtain an intermediate branch;
[0145] taking the intermediate branch as the classification branch, the second predicted position box as the first predicted position box, and the second predicted category as the first predicted category, and repeating the steps of training the classification branch to obtain an intermediate branch;
[0146] taking the last obtained intermediate branch as the classification branch of the instance segmentation model.
[0147] Alternatively, as an embodiment, repeating the steps of training the classification branch to obtain an intermediate branch comprises:
[0148] in the process of repeatedly training the classification branch to obtain the intermediate branch, gradually increasing the weight of the inter-class loss of the classification branch.
[0149] Alternatively, as an embodiment, the apparatus further comprises:
[0150] an image acquisition module, for acquiring an unclassified image containing objects to be classified;
[0151] a category prediction module, for inputting the unclassified image into the instance segmentation model to obtain the predicted category of each object to be classified in the unclassified image, as determined by the classification branch of the instance segmentation model.
[0152] It should be noted that the apparatus embodiment is similar to the method embodiment, so its description is relatively brief; for relevant points, refer to the description of the method embodiment.
[0153] Embodiments of the present invention further provide an electronic device, comprising a memory, a processor and a computer program stored in the memory, where the processor executes the computer program to implement the image classification method disclosed in the embodiments of the present application.
[0154] Embodiments of the present invention further provide a computer-readable storage medium on which a computer program/instructions are stored, where the computer program/instructions are executed by a processor to implement the image classification method disclosed in the embodiments of the present application.
[0155] Embodiments of the present invention further provide a computer program product, comprising a computer program/instructions, where the computer program/instructions are executed by a processor to implement the image classification method disclosed in the embodiments of the present application.
[0156] Each embodiment in the present specification is described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another.
[0157] Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, an apparatus or a computer program product. Thus, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Further, embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk memory, CD-ROM, optical memory, etc.) containing computer-usable program code.
[0158] Embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, apparatus, electronic device and computer program product according to embodiments of the present invention. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a dedicated computer, an embedded processor or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce a device for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
[0159] These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing terminal device to work in a particular manner, such that the instructions stored in the computer-readable memory produce a manufactured article comprising an instruction device, the instruction device implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
[0160] These computer program instructions may also be loaded onto a computer or another programmable data processing terminal device, such that a series of operational steps are performed on the computer or other programmable terminal device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
[0161] Although preferred embodiments of the present invention have been described, those skilled in the art, once they learn of the basic inventive concepts, may make additional changes and modifications to these embodiments. Thus, the appended claims are intended to be construed to include the preferred embodiments and all changes and modifications falling within the scope of the embodiments of the present invention.
[0162] Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Further, the terms "comprise", "include" or any other variation thereof are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device comprising a series of elements includes not only those elements, but also other elements not expressly listed, or elements inherent in such a process, method, article or terminal device. In the absence of further restrictions, an element defined by the statement "including a ..." does not exclude the existence of additional identical elements in the process, method, article or terminal device comprising the element.
[0163] The above is a detailed description of the image classification method, electronic device, storage medium and program product provided by the present application. Specific examples are used herein to elaborate the principles and embodiments of the present application; the description of the above embodiments is only intended to help understand the method of the present application and its core ideas. Meanwhile, for those skilled in the art, there will be changes in the specific embodiments and scope of application according to the ideas of the present application. In summary, the contents of this specification shall not be construed as a restriction on the present application.