Model training method based on knowledge distillation and image processing method and device

A model training and image processing technology in the field of image recognition, addressing problems such as the limited capacity of student models, their inability to perfectly learn from teacher models, and the resulting failure to reach optimal performance.

Pending Publication Date: 2020-06-05
MEGVII (BEIJING) TECHNOLOGY CO., LTD.
Cites: 0 · Cited by: 21

AI Technical Summary

Problems solved by technology

The traditional method treats all training samples equally and makes the student model imitate the teacher model as closely as possible. However, the capacity of the student model is limited, so it cannot perfectly learn all of the teacher model's knowledge, and blind imitation therefore often fails to achieve optimal performance.




Embodiment Construction

[0027] The principle and spirit of the present disclosure will be described below with reference to several exemplary embodiments. It should be understood that these embodiments are given only to enable those skilled in the art to better understand and implement the present disclosure, rather than to limit the scope of the present disclosure in any way.

[0028] It should be noted that although expressions such as "first" and "second" are used herein to describe different modules, steps, data, etc. of the embodiments of the present disclosure, they serve only to distinguish between different modules, steps, data, etc., and do not imply a particular order or degree of importance. In fact, expressions such as "first" and "second" can be used interchangeably.

[0029] In order to make the process of training the student network through knowledge distillation more efficient, and transfer more reliable and useful knowledge from the comp...



Abstract

The invention provides a model training method based on knowledge distillation, applied to a student model, comprising: setting, at a distillation position, a second output layer identical to a first output layer; obtaining a training set comprising a plurality of training data; obtaining, based on the training data, first data output by the first output layer and second data output by the second output layer; obtaining, based on the training data, supervision data output by the teacher model at a teacher layer corresponding to the distillation position, wherein the teacher model is a complex model that has completed training and performs the same task as the student model; obtaining a distillation loss value according to a distillation loss function based on the differences between the supervision data and each of the first data and the second data; and updating the parameters of the student model based on the distillation loss value. Through the disclosed embodiments, the teacher model in knowledge distillation places greater emphasis on transmitting the knowledge of simple data to the student model, which improves the training efficiency of knowledge distillation and ensures the accuracy of the student model.
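A minimal sketch of the training step described in the abstract, assuming a PyTorch-style student with a generic backbone. The class names, layer shapes, and the use of mean-squared error as the distillation loss function are illustrative assumptions, since the abstract does not fix a concrete loss function or distillation position.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistilledStudent(nn.Module):
    """Hypothetical student model: the feature at the chosen distillation
    position feeds both the original (first) output layer and an added
    second output layer with the same shape."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone                            # produces the feature at the distillation position
        self.first_out = nn.Linear(feat_dim, num_classes)   # first output layer (original)
        self.second_out = nn.Linear(feat_dim, num_classes)  # second output layer, same shape, set at the distillation position

    def forward(self, x: torch.Tensor):
        feat = self.backbone(x)
        return self.first_out(feat), self.second_out(feat)  # first data, second data

def distillation_train_step(student, teacher, optimizer, x):
    """One update: compare the teacher's supervision data against both
    student outputs and step on the resulting distillation loss value."""
    first_data, second_data = student(x)
    with torch.no_grad():                                   # teacher has completed training; no gradients
        supervision = teacher(x)                            # output of the teacher layer corresponding to the distillation position
    # MSE stands in here for the patent's unspecified distillation loss function.
    loss = F.mse_loss(first_data, supervision) + F.mse_loss(second_data, supervision)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                        # update the student model's parameters
    return loss.item()
```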

Description

Technical Field

[0001] The present disclosure generally relates to the field of image recognition, and specifically relates to a knowledge distillation-based model training method, a knowledge distillation-based model training device, an image processing method, an image processing device, electronic equipment, and a computer-readable storage medium.

Background Technique

[0002] With the development of artificial intelligence recognition technology, models are widely used for data processing, image recognition, and so on. While recognition accuracy and recognition range continue to improve, neural networks are also becoming larger and larger: computation is time-consuming, parameters are numerous, and a huge storage capacity is required. It is therefore difficult to deploy them on mobile terminals, especially mobile terminals with weak hardware.

[0003] Knowledge distillation is a model compression method. In the teacher-student framework, the feature representation "k...
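For context on the teacher-student framework mentioned in [0003], a minimal sketch of classic soft-target knowledge distillation follows. The function name and temperature value are illustrative assumptions, and this shows the conventional technique rather than the patent's specific method.

```python
import torch.nn.functional as F

def soft_target_distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Classic knowledge distillation: the student matches the teacher's
    temperature-softened class distribution via KL divergence."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=1)
    log_student = F.log_softmax(student_logits / temperature, dim=1)
    # Scale by T^2 so gradient magnitudes stay comparable to the hard-label loss.
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2
```

In practice this term is usually combined with the ordinary cross-entropy loss on the ground-truth labels.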


Application Information

Patent Type & Authority Applications(China)
IPC IPC(8): G06N3/08
CPCG06N3/08
Inventor 张有才戴雨辰常杰危夷晨
Owner MEGVII BEIJINGTECH CO LTD