Knowledge distillation-based model training method, image processing method, and device

A model training and image processing technique, applied in the field of image recognition, that addresses problems such as the limited capacity of student models, their inability to learn teacher models perfectly, and the resulting failure to reach optimal performance.
CN111242297A · Pending · Publication Date: 2020-06-05 · MEGVII BEIJINGTECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: MEGVII BEIJINGTECH CO LTD
Publication Date: 2020-06-05

[Figures 1 to 3: images not reproduced]

Abstract

The invention provides a model training method based on knowledge distillation. The method is applied to a student model and comprises: setting, at a distillation position of the student model, a second output layer that is the same as the first output layer at that position; obtaining a training set comprising a plurality of training data; obtaining, based on the training data, first data output by the first output layer and second data output by the second output layer; obtaining, based on the training data, supervision data output by a teacher model at the teacher layer corresponding to the distillation position, wherein the teacher model is a complex model that has completed training and performs the same task as the student model; obtaining a distillation loss value from a distillation loss function, based on the difference between the supervision data and the first data and the second data; and updating the parameters of the student model based on the distillation loss value. In the disclosed embodiments, knowledge distillation places greater emphasis on having the teacher model transfer knowledge of simple data to the student model, which improves the training efficiency of knowledge distillation while ensuring the accuracy of the student model.
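The loss step in the abstract can be illustrated with a minimal sketch. The abstract does not give the distillation loss function, so the formula below is an assumption, not the patent's method: it uses a Gaussian negative log-likelihood style loss in which the second output layer's data is treated as a per-element log-variance. The function name `distillation_loss` and its argument names are hypothetical.

```python
import math
from typing import Sequence


def distillation_loss(supervision: Sequence[float],
                      first: Sequence[float],
                      second: Sequence[float]) -> float:
    """Hypothetical distillation loss (an assumption, not the patent's formula).

    supervision: teacher output at the layer matching the distillation position
    first:       output of the student's first output layer
    second:      output of the student's second output layer, treated here
                 (assumption) as a per-element log-variance
    """
    if not (len(supervision) == len(first) == len(second)):
        raise ValueError("all inputs must have the same length")
    if len(supervision) == 0:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for t, y, s in zip(supervision, first, second):
        # Squared teacher-student gap, scaled down where s is large;
        # the +0.5*s term penalizes inflating s everywhere.
        total += 0.5 * math.exp(-s) * (t - y) ** 2 + 0.5 * s
    return total / len(supervision)


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    student_first = [0.9, 2.5, 0.0]
    student_second = [0.0, 0.0, 2.0]  # last element marked as "hard"
    print(distillation_loss(teacher, student_first, student_second))
```

Under this assumed loss, elements where the student reports a large second-layer value contribute less of their squared error, so gradient updates are dominated by data the student can fit well. That is one possible reading of the abstract's emphasis on simple data; other loss functions would fit the same description.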

Description

Technical field

[0001] The present disclosure generally relates to the field of image recognition, and specifically relates to a knowledge distillation-based model training method, a knowledge distillation-based model training device, an image processing method, an image processing device, electronic equipment, and a computer-readable storage medium.

Background art

[0002] With the development of artificial intelligence recognition, models are widely used for data processing, image recognition, and similar tasks. As recognition accuracy and recognition range continue to improve, neural networks keep growing larger: computation is time-consuming, parameters are numerous, and a huge storage capacity is required. It is therefore difficult to deploy them on mobile terminals, especially those with limited hardware.

[0003] Knowledge distillation is a model compression method. In the teacher-student framework, the feature representation "k...

Claims
