Model compression method based on pruning sequence active learning

A model compression technology based on active learning, applied in the field of neural network models. It addresses the problems that existing methods ignore the influence of individual convolutional layers, that their pruning strategies are too simple, and that overall model accuracy is reduced, so as to minimize the loss of model accuracy, compress the model volume, and offer strong prospects for practical application.

Inactive Publication Date: 2019-04-19
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

Existing methods focus on evaluating the importance of convolution kernels, but their pruning strategies are too simple, so the results are unsatisfactory.
However, an important phenomenon has been overlooked: each convolutional layer has a different importance. If even a very small number of convolution kernels are removed from an important convolutional layer, the accuracy of the overall model may drop sharply; conversely, on an unimportant convolutional layer, even cutting away a large number of convolution kernels hardly affects accuracy. Sequential pruning and global pruning obviously do not take the importance of each convolutional layer into account, which limits the pruning effect.
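
This phenomenon can be checked with a simple layer-sensitivity probe. The sketch below is illustrative only and not part of the invention: the `evaluate` callback, the crude zeroing of filters, and the 30% ratio are assumptions. It zeroes the same fraction of filters in one convolutional layer at a time and records the accuracy drop, so important layers show a large drop while unimportant layers barely change.

```python
import copy

import torch.nn as nn

def layer_sensitivity(model, evaluate, ratio=0.3):
    """`evaluate(model) -> accuracy` is assumed to be supplied by the caller."""
    base = evaluate(model)
    drops = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            probe = copy.deepcopy(model)
            conv = dict(probe.named_modules())[name]
            k = int(ratio * conv.out_channels)
            conv.weight.data[:k] = 0          # crudely "remove" the first k filters
            if conv.bias is not None:
                conv.bias.data[:k] = 0
            drops[name] = base - evaluate(probe)
    return drops  # large drop => important layer, small drop => safe to prune
```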

Embodiment Construction

[0033] The model compression method provided by the present invention is described in detail below, taking pruning of only the convolutional layers as an example.

[0034] A model compression method based on active learning of the pruning order, as shown in Figure 1, includes:

[0035] S1. Use an LSTM (Long Short-Term Memory) network to learn the sequential characteristics of the network and decide whether each convolutional layer needs to be pruned;
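
A minimal sketch of how such an LSTM decision module could look is given below. It is an assumption for illustration only: the per-layer descriptor features, the `PruneOrderController` name, the hidden size, and the 0.5 decision threshold are hypothetical, and the training of the controller (for example, from feedback on accuracy after pruning) is not shown.

```python
import torch
import torch.nn as nn

class PruneOrderController(nn.Module):
    """LSTM that reads a per-layer descriptor sequence and emits prune/keep probabilities."""
    def __init__(self, feat_dim=3, hidden_dim=32):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)            # per-layer prune probability

    def forward(self, layer_feats):                     # layer_feats: (1, num_layers, feat_dim)
        h, _ = self.lstm(layer_feats)
        return torch.sigmoid(self.head(h)).squeeze(-1)  # (1, num_layers)

def layer_descriptors(model):
    """Hypothetical per-layer features: (#filters, kernel size, depth index), roughly normalized."""
    convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
    feats = [[c.out_channels / 512.0, c.kernel_size[0] / 7.0, i / max(1, len(convs) - 1)]
             for i, c in enumerate(convs)]
    return torch.tensor(feats).unsqueeze(0)

# Usage (threshold 0.5 is an assumption):
#   model = torchvision.models.vgg16(weights=None)
#   probs = PruneOrderController()(layer_descriptors(model))
#   prune_mask = probs > 0.5   # which convolutional layers to prune in this round
```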

[0036] S2. Evaluate and prune the convolution kernels in the selected convolutional layer. The kernel evaluation method takes into account the correlation between the preceding and following convolutional layers and uses a non-data-driven method to quickly evaluate the importance of each convolution kernel; a recovery mechanism is also proposed to restore model accuracy immediately after pruning;
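
One possible data-free realization of such a cross-layer importance score is sketched below, under the assumption that a filter in layer l is scored by its own weight magnitude together with how strongly the following layer l+1 uses the corresponding input channel. The weighting factor `alpha`, the pruning ratio, and the function names are hypothetical, and the recovery mechanism mentioned above is not sketched.

```python
import torch
import torch.nn as nn

def filter_scores(conv_l: nn.Conv2d, conv_next: nn.Conv2d, alpha=0.5):
    # |W_l| per output filter of layer l, and how much layer l+1 uses each of those channels
    own = conv_l.weight.detach().abs().sum(dim=(1, 2, 3))            # shape: (out_l,)
    used_by_next = conv_next.weight.detach().abs().sum(dim=(0, 2, 3))  # shape: (out_l,)
    return alpha * own / own.max() + (1 - alpha) * used_by_next / used_by_next.max()

def select_filters_to_prune(conv_l, conv_next, prune_ratio=0.3):
    scores = filter_scores(conv_l, conv_next)
    k = int(prune_ratio * scores.numel())
    return torch.argsort(scores)[:k]   # indices of the least important filters

# Example with two toy adjacent layers
l1, l2 = nn.Conv2d(16, 32, 3), nn.Conv2d(32, 64, 3)
print(select_filters_to_prune(l1, l2, prune_ratio=0.25))
```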

[0037] S3. Use a teacher network to accelerate retraining of the pruned model;
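
S3 reads as teacher-student retraining; a minimal sketch using standard knowledge distillation is shown below, on the assumption that the unpruned model serves as the teacher. The temperature `T` and mixing weight `lam` are illustrative values, not parameters given by the invention.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, lam=0.7):
    # Soft-target term: match the teacher's softened output distribution
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    # Hard-label term: ordinary cross-entropy on the ground-truth labels
    hard = F.cross_entropy(student_logits, labels)
    return lam * soft + (1 - lam) * hard

# Inside the retraining loop (teacher kept frozen):
#   with torch.no_grad():
#       t_logits = teacher(x)
#   loss = distillation_loss(pruned_model(x), t_logits, y)
#   loss.backward(); optimizer.step()
```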

[0038] S4. Calculate the expressive power a...

Abstract

The invention provides a model compression method based on pruning sequence active learning, namely an end-to-end pruning framework based on sequential active learning. The method can actively learn the importance of every layer of the network, generate pruning priorities, and make reasonable pruning decisions, solving the problem that existing simple sequential pruning methods are unreasonable: pruning is performed preferentially on the network layers with the least influence and proceeds step by step from easy to difficult, so that the loss of model accuracy during pruning is minimized. At the same time, guided by the final loss of the model, the importance of the convolution kernels is evaluated in a multi-angle, efficient, flexible and rapid manner, ensuring the correctness and effectiveness of the whole compression process and providing technical support for subsequently porting large models to portable devices. Experimental results show that the model compression method based on pruning sequence active learning provided by the invention leads on multiple data sets and multiple model structures, can greatly compress the model volume while maintaining model accuracy, and has a very strong prospect for practical application.

Description

Technical Field

[0001] The invention belongs to the technical field of neural network models, and in particular relates to a model compression method based on active learning of the pruning order.

Background Technique

[0002] In recent years, with the vigorous development of deep neural networks, academia and industry have witnessed major breakthroughs of deep learning in many fields such as computer vision and natural language processing. The expressive power of convolutional neural networks (CNN) in some visual tasks even exceeds the visual processing ability of human beings.

[0003] Although deep networks have made significant breakthroughs in the field of vision, the size and computational cost of the models have become the bottleneck of their practical application. Applying deep networks in real-world scenarios demands fast computing power, a large amount of storage space, and sufficient battery capacity from the hardware. Large-scale neural networks can be efficiently run ...

Claims

Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/04; G06N3/08
CPC: G06N3/082; G06N3/045
Inventor: 丁贵广, 钟婧
Owner: TSINGHUA UNIV