Method of compressing a pre-trained deep neural network model

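A deep neural network model technique, applied to the compression of pre-trained deep neural network models. It addresses the problems that models cannot be installed on end-user devices, that deep learning models are large, and that deep learning models cannot be trained from scratch, and it reduces the size of the deep neural network while ensuring accuracy.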

Status: Inactive
Publication Date: 2019-12-10
KNERON TAIWAN CO LTD

AI Technical Summary

Problems solved by technology

With large data sets, powerful computing hardware, and ample memory available, deep learning models are becoming larger and more complex.
However, these huge models cannot be installed on end-user devices with little memory and limited computing power, such as mobile phones and embedded devices.
Furthermore, custom deep learning models for end-user devices cannot be trained from scratch, because the available data sets are limited.



Embodiment Construction

[0025] The model compression method proposed by the present invention removes unnecessary layers from a deep neural network (DNN) and automatically introduces sparse structure into the computation-intensive layers.
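The abstract names batch normalization as one kind of layer that is removed. The excerpt does not say how the removal is done; a common technique, assumed here, is to fold the batch normalization statistics into the preceding linear layer so that the layer disappears without changing the output. The sketch below illustrates that idea only. The function name `fold_batchnorm` and the list-of-rows weight layout are hypothetical and do not come from the patent text.

```python
from math import sqrt
from typing import List, Tuple


def fold_batchnorm(
    weights: List[List[float]],  # weights[c] produces output channel c
    bias: List[float],
    gamma: List[float],          # batch-norm scale per channel
    beta: List[float],           # batch-norm shift per channel
    mean: List[float],           # running mean per channel
    var: List[float],            # running variance per channel
    eps: float = 1e-5,
) -> Tuple[List[List[float]], List[float]]:
    """Return (weights', bias') such that the linear layer alone reproduces
    linear followed by batch normalization at inference time.

    Assumed technique (not stated in the source):
        y = gamma * (W x + b - mean) / sqrt(var + eps) + beta
          = (scale * W) x + (b - mean) * scale + beta
    with scale = gamma / sqrt(var + eps).
    """
    folded_w: List[List[float]] = []
    folded_b: List[float] = []
    for c, row in enumerate(weights):
        scale = gamma[c] / sqrt(var[c] + eps)
        folded_w.append([w * scale for w in row])
        folded_b.append((bias[c] - mean[c]) * scale + beta[c])
    return folded_w, folded_b
```

After folding, the batch normalization layer can be dropped from the graph, which saves its parameters and one pass over the activations per inference.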

[0026] Figure 1 is a schematic diagram of the model compression architecture 100 of an embodiment. The model compression architecture 100 includes an input element 110 and a self-adjusting incremental model compression 130. The input element 110 includes a DNN pre-trained model 112 (such as AlexNet, VGG16, ResNet, MobileNet, GoogLeNet, ShuffleNet, ResNeXt, or Xception), user training and validation data 114, and a user performance indicator 116. The self-adjusting incremental model compression 130 analyzes the sparsity of the pre-trained model 112, automatically prunes and quantizes network redundancy, and removes unnecessary layers in the model compression architecture 100. Meanwhile, the proposed technique can reuse the parameters...
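As a rough illustration of what "analyzing sparsity" and "pruning redundancy" can mean for a single weight matrix, the sketch below measures the fraction of zero weights and then zeroes the smallest-magnitude entries. Magnitude pruning is an assumed stand-in here; the excerpt does not specify the pruning criterion. The names `sparsity` and `magnitude_prune` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def sparsity(weights: Matrix, eps: float = 0.0) -> float:
    """Fraction of entries whose magnitude is at most eps."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if abs(w) <= eps)
    return zeros / total


def magnitude_prune(weights: Matrix, target_sparsity: float) -> Matrix:
    """Return a copy with the smallest-magnitude entries set to zero.

    Assumed criterion (not from the source): entries are ranked by absolute
    value and the lowest `target_sparsity` fraction is zeroed. Ties at the
    threshold are all zeroed, so the result can be slightly sparser than
    requested.
    """
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(target_sparsity * len(magnitudes))  # number of entries to zero
    if k == 0:
        return [row[:] for row in weights]
    threshold = magnitudes[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]
```

For example, pruning a 2x3 matrix to a target sparsity of 0.5 zeroes its three smallest-magnitude weights and leaves the other three untouched.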


Abstract

A method of compressing a pre-trained deep neural network model includes inputting the pre-trained deep neural network model as a candidate model. The candidate model is compressed by increasing the sparsity of the candidate model, removing at least one batch normalization layer present in the candidate model, and quantizing all remaining weights into a fixed-point representation to form a compressed model. The accuracy of the compressed model is then determined using an end-user training and validation data set. Compression of the candidate model is repeated when the accuracy improves. With this method, the size of the DNN is reduced, and implementation requirements such as memory, hardware, and processing are reduced as well. Inference runs much faster and requires much less computation. Lastly, with the disclosed method of compression, these benefits are obtained while guaranteeing accuracy during inference.
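The control flow in the abstract (compress, measure accuracy, repeat while accuracy improves) and the fixed-point step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the bit widths, the saturating rounding scheme, the baseline evaluation of the uncompressed candidate, and the names `to_fixed_point`, `from_fixed_point`, and `compress_while_improving` are all hypothetical. The `compress_step` and `evaluate` callables stand in for the sparsify/remove/quantize pass and for evaluation on the end-user validation data.

```python
from typing import Callable, Tuple, TypeVar

Model = TypeVar("Model")


def to_fixed_point(x: float, frac_bits: int = 8, total_bits: int = 16) -> int:
    """Quantize x to a signed fixed-point integer with `frac_bits` fractional
    bits, saturating at the representable range (assumed scheme).
    Python's round() rounds exact halves to the nearest even integer."""
    q = round(x * (1 << frac_bits))
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, q))


def from_fixed_point(q: int, frac_bits: int = 8) -> float:
    """Recover the real value represented by a fixed-point integer."""
    return q / (1 << frac_bits)


def compress_while_improving(
    candidate: Model,
    compress_step: Callable[[Model], Model],  # hypothetical: one compression pass
    evaluate: Callable[[Model], float],       # hypothetical: accuracy on user data
    max_rounds: int = 10,
) -> Tuple[Model, float]:
    """Repeat compression while the measured accuracy improves.

    Assumption: the uncompressed candidate is evaluated first to obtain a
    baseline, and a round is accepted only if accuracy strictly increases.
    """
    best, best_acc = candidate, evaluate(candidate)
    for _ in range(max_rounds):
        trial = compress_step(best)
        acc = evaluate(trial)
        if acc <= best_acc:  # no improvement: keep the last accepted model
            break
        best, best_acc = trial, acc
    return best, best_acc
```

Passing the compression pass and the evaluator as callables keeps the loop independent of any particular framework; a real pipeline would supply functions that operate on its own model objects.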

Description

Technical field

[0001] The present invention relates to a method for compressing a large-scale deep neural network, and in particular to a method for compressing a pre-trained deep neural network model.

Background

[0002] Large-scale deep neural networks have achieved remarkable results in computer vision, image recognition, and speech recognition. With massive data sets, powerful computing hardware, and ample memory, deep learning models are becoming larger and more complex. However, these huge models cannot be installed on end-user devices with little memory and limited computing power, such as mobile phones and embedded devices. Furthermore, custom deep learning models for end-user devices cannot be trained from scratch, because the available data sets are limited.

Summary of the invention

[0003] The present invention discloses a method for compressing a pre-trained deep neural network model. The deep neural network model includes multiple layers, and each layer of the mu...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/063, G06N3/08
CPC: G06N3/063, G06N3/082, G06N3/045, G06N20/10
Inventors: 谢必克, 苏俊杰, 伍捷, 张博栋, 刘峻诚
Owner: KNERON TAIWAN CO LTD