
Multi-module knowledge distillation method based on MoCo model

A multi-module knowledge distillation method, applied in neural learning methods, character and pattern recognition, biological neural network models, and related fields, that improves accuracy, updates sample features effectively, and reduces errors.

Active Publication Date: 2022-07-22
CHINA UNIV OF MINING & TECH

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a multi-module knowledge distillation method based on the MoCo model that solves the problem of training on large-scale data sets with limited memory, reducing the amount of computation and improving memory efficiency.



Examples


Embodiment 1

[0065] The steps of the multi-module knowledge distillation method based on the MoCo model of the present invention are as follows:

[0066] Step S1: Randomly collect 5,000 labeled images from ImageNet, resize them one by one to a uniform size, and then apply data augmentation to obtain 10,000 images of 256×256 pixels, which constitute the teacher network training set.

[0067] Step S2: Input the teacher network training set into the teacher network and pre-train the teacher network on it to obtain a pre-trained teacher network.

[0068] Step S3: Randomly collect 50,000 images from Instagram, resize them one by one to a uniform size, and then apply data augmentation to obtain 100,000 images of 256×256 pixels, which constitute the teacher-student network training set.

[0069] Step S4: Build a MoCo model for multi-module knowledge distillation:

[0070] The MoCo model includes a pre-training teacher network and a s...
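The data preparation in Steps S1 and S3 (unify the image sizes, then augment so that N images become 2N) can be sketched as follows. This is a minimal, dependency-free illustration and not the patent's implementation: the excerpt does not say which augmentation is used, so a horizontal flip is assumed here, and the function names `resize_nearest`, `horizontal_flip`, and `build_training_set` are hypothetical. Images are modeled as plain 2-D lists of pixel values.

```python
def resize_nearest(image, size=256):
    """Resize a 2-D image (list of rows) to size x size by nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    return [
        [image[(r * height) // size][(c * width) // size] for c in range(size)]
        for r in range(size)
    ]


def horizontal_flip(image):
    """Mirror every row. ASSUMPTION: stands in for the unspecified augmentation."""
    return [row[::-1] for row in image]


def build_training_set(images, size=256):
    """Unify the sizes one by one, then add one augmented copy per image (N -> 2N)."""
    resized = [resize_nearest(img, size) for img in images]
    return resized + [horizontal_flip(img) for img in resized]
```

Applied to 5,000 source images this yields 10,000 images of 256×256 pixels, matching the counts in Step S1; 50,000 source images yield the 100,000 of Step S3.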


Abstract

The invention discloses a multi-module knowledge distillation method based on a MoCo model. Exploiting the similarity of the features generated at intermediate stages, the method divides a teacher network and a student network into a plurality of corresponding modules, extracts the features generated by each module of the teacher network and the student network through the MoCo model, calculates their similarity, and performs knowledge distillation between the teacher network and the student network, so that the teacher network guides the student network by means of this similarity. The method can update the sample features automatically and dynamically using only a small number of labels, and it is more memory efficient, which solves the problem of training on a large-scale data set with limited memory. A student network trained under the guidance of the teacher network is both robust and able to generalize.
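One plausible reading of the per-module similarity computation is sketched below. The excerpt does not give the loss function, so this sketch assumes an InfoNCE-style contrastive loss with a fixed-size queue of negative keys, as in the original MoCo design: for each module, the student feature is the query, the teacher feature of the same sample is the positive key, and teacher features from earlier samples are the negatives. All names (`info_nce`, `multi_module_loss`, `temperature`) and the temperature value are assumptions for illustration.

```python
import math
from collections import deque


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(student_feat, teacher_feat, queue, temperature=0.07):
    """Contrastive loss for one sample in one module (ASSUMED loss form)."""
    q = l2_normalize(student_feat)
    k_pos = l2_normalize(teacher_feat)
    # Index 0 is the positive pair; the rest are negatives from the queue.
    logits = [dot(q, k_pos) / temperature]
    logits += [dot(q, k_neg) / temperature for k_neg in queue]
    # Numerically stable log-sum-exp.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum - logits[0]


def multi_module_loss(student_feats, teacher_feats, queues, temperature=0.07):
    """Sum the per-module losses, then enqueue each module's teacher key."""
    total = 0.0
    for s, t, queue in zip(student_feats, teacher_feats, queues):
        total += info_nce(s, t, queue, temperature)
        queue.append(l2_normalize(t))  # deque(maxlen=K) drops the oldest key
    return total
```

Each module keeps its own `deque(maxlen=K)`, so memory use is bounded by K keys per module regardless of data set size, which is the property the abstract attributes to the dynamically updated sample features.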

Description

Technical field

[0001] The invention belongs to the field of model lightweighting technology, and in particular relates to a multi-module knowledge distillation method based on a MoCo model.

Background technique

[0002] In recent years, machine learning and deep learning have made remarkable progress in computer vision, natural language processing, prediction, and audio processing. However, deploying such models on devices is difficult. In knowledge distillation, a larger, cumbersome network (the teacher model) trained on a large data set can transfer its learned knowledge well to a smaller and lighter network (the student model).

[0003] In the research based on hints for thin networks, a two-stage strategy is introduced to train deep networks, but there is no significant speed improvement; deep mutual learning proposes that teacher and student networks learn from each other and update at the same time, but it is difficult to extract more detailed information, which leads to larger errors; in the regeneration ...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06V10/774, G06V10/776, G06V10/778, G06V10/82, G06V10/74, G06V10/772, G06N3/08, G06N3/04, G06K9/62
CPC: G06N3/08, G06N3/084, G06N3/088, G06N3/045, G06F18/28, G06F18/2155, G06F18/217, G06F18/22, G06F18/214
Inventor: 王军, 袁静波, 刘新旺, 李玉莲, 李兵
Owner: CHINA UNIV OF MINING & TECH