Depth model compression method for edge device multi-layer shared codebook vector quantization

An edge-device, codebook-vector technology, applied in neural learning methods, biological neural network models, neural architectures, etc. It solves the problem of vector quantization increasing a model's running time, and achieves the effects of maintaining accuracy, reducing quantization loss, and reducing model size.

Pending Publication Date: 2022-07-22
Applicant: SHENYANG INSTITUTE OF CHEMICAL TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a deep model compression method for multi-layer shared codebook vector quantization on edge devices. The invention combines channel pruning and vector quantization: channel pruning reduces the number of channels in each layer of the model, which effectively solves the problem that vector quantization alone increases the model's running time during inference, and further compresses the storage space occupied by the model parameters.

Embodiment Construction

[0044] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, but not all, of the embodiments of the present invention.

[0045] This embodiment discloses a deep model compression method for multi-layer shared codebook vector quantization on edge devices, as shown in Figure 1; compressing a network model with it involves the five steps described below. For ease of understanding, this example uses the PyTorch deep learning framework to build and train a six-class neural network model with resnet18 as the backbone network. The original model size is 42.77 MB, and its six-class accuracy is 98.96%.
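
The patent excerpt does not reproduce the setup code. A minimal sketch of how such a backbone is typically built in PyTorch, assuming the torchvision resnet18 (the patent names only the framework and the backbone); the size check is plain float32 arithmetic, parameter count × 4 bytes:

```python
import torch.nn as nn
from torchvision.models import resnet18

# Build a resnet18 backbone and swap its 1000-class ImageNet head for a
# six-class classifier, matching the embodiment's six-class model.
model = resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, 6)

# Size check: float32 parameters at 4 bytes each come out near the
# 42.77 MB original model size reported above.
n_params = sum(p.numel() for p in model.parameters())
print(f"~{n_params * 4 / 2**20:.2f} MB")  # about 42.6 MB plus BN buffers
```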

[0046] S1, sparse training: as shown in Figure 2, sparse training is the first part of channel pruning. The trainable scaling parameter γ of each BN layer in the network model is selected as the channel evaluation factor, where the output formula of the...
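
The output formula is truncated in the source. Selecting the BN scale γ as the channel evaluation factor matches the common channel-pruning recipe in which the BN output is y = γ·x̂ + β (x̂ being the normalized activation), so γ scales each channel, and an L1 term on γ added to the training loss drives unimportant channels toward zero. A minimal sketch under that assumption (the penalty weight lam is illustrative, not taken from the patent):

```python
import torch.nn as nn

def bn_sparsity_penalty(model: nn.Module, lam: float = 1e-4):
    """L1 penalty on the BN scale factors (gamma = m.weight): added to the
    task loss during sparse training, it drives the gamma of unimportant
    channels toward zero so they can be removed in the channel-pruning step."""
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules()
                     if isinstance(m, nn.BatchNorm2d))

# Inside the training loop (illustrative):
#     loss = criterion(model(x), y) + bn_sparsity_penalty(model)
#     loss.backward()
#     optimizer.step()
```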

Abstract

The invention relates to a deep neural network model compression method, in particular to a depth model compression method for edge-device multi-layer shared codebook vector quantization, comprising five steps: sparse training, channel pruning, weight arrangement optimization, group vector quantization, and codebook fine-tuning. Sparse training trains the model with a sparsity constraint on each channel evaluation factor; channel pruning removes the channels of low importance from the model; weight arrangement optimization rearranges the weights of the smaller network left after channel pruning; group vector quantization quantizes the pruned model to produce a lightweight network model; and codebook fine-tuning recovers the accuracy of the lightweight model. The method compresses large, structurally complex cloud-side network models into lightweight models convenient for edge deployment, meeting model-deployment requirements where the computing-power and storage resources of edge devices are limited. It effectively reduces the model's demands on storage space and computing power and maximizes the utilization of the edge device's computing and storage resources.
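
The core step named here is group vector quantization against a codebook shared across multiple layers. Below is a minimal sketch of one plausible form of that idea: pool d-dimensional sub-vectors from several layers' weights and fit a single k-entry codebook with plain k-means. The group size d, codebook size k, and the helper's name are illustrative assumptions, not the patent's exact scheme:

```python
import torch

def shared_codebook_quantize(weights, d=4, k=256, iters=20):
    """Fit one k-entry codebook, shared across several layers, over all
    d-dimensional sub-vectors of the given weight tensors (plain k-means).
    Each layer then stores only sub-vector indices plus the shared codebook,
    which is what shrinks the model."""
    # Flatten every weight tensor into d-dim sub-vectors
    # (assumes each tensor's element count is divisible by d).
    vecs = torch.cat([w.detach().reshape(-1, d) for w in weights])
    codebook = vecs[torch.randperm(len(vecs))[:k]].clone()  # init from data
    for _ in range(iters):
        assign = torch.cdist(vecs, codebook).argmin(dim=1)  # nearest entry
        for j in range(k):
            members = vecs[assign == j]
            if len(members):
                codebook[j] = members.mean(dim=0)           # move centroid
    sizes = [w.numel() // d for w in weights]
    return codebook, assign.split(sizes)

# Layer i is reconstructed as codebook[idx_i].reshape(weights[i].shape);
# codebook fine-tuning then retrains only the k codebook entries, with the
# index assignments frozen, to recover the accuracy lost to quantization.
```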

Description

Technical field

[0001] The invention relates to a network model compression method, in particular to a depth model compression method oriented to edge-device multi-layer shared codebook vector quantization.

Background technique

[0002] Deep neural network models have been widely used in computer vision, speech recognition, natural language processing, autonomous driving and other fields, and have huge application prospects on mobile-terminal and embedded edge devices. As more and more artificial intelligence solutions reach the deployment stage, some scenarios place strict requirements on model computation speed and network transmission speed; edge computing meets these requirements by deploying the deep model directly on the edge device, which produces results as soon as it receives data, without relying on a cloud environment. But running deep models requires powerful computing power...

Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/08; G06N3/04
CPC: G06N3/082; G06N3/045
Inventors: 黄明忠, 刘研, 赵立杰, 王国刚
Owner: SHENYANG INSTITUTE OF CHEMICAL TECHNOLOGY