Depth model compression method for edge device multi-layer shared codebook vector quantization

An edge-device, codebook-vector technology, applied in neural learning methods, biological neural network models, neural architectures, etc. It solves the problem of vector quantization increasing a model's running time, and achieves the effects of maintaining accuracy, reducing quantization loss, and reducing model size.

Pending Publication Date: 2022-07-22
Applicant: SHENYANG INSTITUTE OF CHEMICAL TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a deep model compression method for multi-layer shared codebook vector quantization on edge devices. The invention combines channel pruning and vector quantization: channel pruning reduces the number of channels in each layer of the model, which effectively solves the problem that vector quantization alone increases the model's running time during inference, and further compresses the storage space occupied by the model parameters.

Embodiment Construction

[0044] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, but not all, of the embodiments of the present invention.

[0045] This embodiment discloses a deep model compression method for multi-layer shared codebook vector quantization on edge devices, as shown in Figure 1; compressing a network model with it involves the five steps described below. For ease of understanding, this example uses the PyTorch deep learning framework to build and train a six-class neural network model with resnet18 as the backbone network. The original model size is 42.77 MB, and its six-class accuracy is 98.96%.
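
The patent excerpt does not reproduce the setup code. A minimal sketch of how such a backbone is typically built in PyTorch, assuming the torchvision resnet18 (the patent names only the framework and the backbone); the size check is plain float32 arithmetic, parameter count × 4 bytes:

```python
import torch.nn as nn
from torchvision.models import resnet18

# Build a resnet18 backbone and swap its 1000-class ImageNet head for a
# six-class classifier, matching the embodiment's six-class model.
model = resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, 6)

# Size check: float32 parameters at 4 bytes each come out near the
# 42.77 MB original model size reported above.
n_params = sum(p.numel() for p in model.parameters())
print(f"~{n_params * 4 / 2**20:.2f} MB")  # about 42.6 MB plus BN buffers
```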

[0046] S1, sparse training: as shown in Figure 2, sparse training is the first part of channel pruning. The trainable scaling parameter γ of each BN layer in the network model is selected as the channel evaluation factor, where the output formula of the...
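
The output formula is truncated in the source. Selecting the BN scale γ as the channel evaluation factor matches the common channel-pruning recipe in which the BN output is y = γ·x̂ + β (x̂ being the normalized activation), so γ scales each channel, and an L1 term on γ added to the training loss drives unimportant channels toward zero. A minimal sketch under that assumption (the penalty weight lam is illustrative, not taken from the patent):

```python
import torch.nn as nn

def bn_sparsity_penalty(model: nn.Module, lam: float = 1e-4):
    """L1 penalty on the BN scale factors (gamma = m.weight): added to the
    task loss during sparse training, it drives the gamma of unimportant
    channels toward zero so they can be removed in the channel-pruning step."""
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules()
                     if isinstance(m, nn.BatchNorm2d))

# Inside the training loop (illustrative):
#     loss = criterion(model(x), y) + bn_sparsity_penalty(model)
#     loss.backward()
#     optimizer.step()
```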

Abstract

The invention relates to a deep neural network model compression method, in particular to a depth model compression method for edge-device multi-layer shared codebook vector quantization, comprising five steps: sparse training, channel pruning, weight arrangement optimization, group vector quantization, and codebook fine-tuning. Sparse training trains the model with a sparsity constraint on each channel evaluation factor; channel pruning removes the channels of low importance from the model; weight arrangement optimization rearranges the weights of the smaller network left after channel pruning; group vector quantization quantizes the pruned model to produce a lightweight network model; and codebook fine-tuning recovers the accuracy of the lightweight model. The method compresses large, structurally complex cloud-side network models into lightweight models convenient for edge deployment, meeting model-deployment requirements where the computing-power and storage resources of edge devices are limited. It effectively reduces the model's demands on storage space and computing power and maximizes the utilization of the edge device's computing and storage resources.
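
The core step named here is group vector quantization against a codebook shared across multiple layers. Below is a minimal sketch of one plausible form of that idea: pool d-dimensional sub-vectors from several layers' weights and fit a single k-entry codebook with plain k-means. The group size d, codebook size k, and the helper's name are illustrative assumptions, not the patent's exact scheme:

```python
import torch

def shared_codebook_quantize(weights, d=4, k=256, iters=20):
    """Fit one k-entry codebook, shared across several layers, over all
    d-dimensional sub-vectors of the given weight tensors (plain k-means).
    Each layer then stores only sub-vector indices plus the shared codebook,
    which is what shrinks the model."""
    # Flatten every weight tensor into d-dim sub-vectors
    # (assumes each tensor's element count is divisible by d).
    vecs = torch.cat([w.detach().reshape(-1, d) for w in weights])
    codebook = vecs[torch.randperm(len(vecs))[:k]].clone()  # init from data
    for _ in range(iters):
        assign = torch.cdist(vecs, codebook).argmin(dim=1)  # nearest entry
        for j in range(k):
            members = vecs[assign == j]
            if len(members):
                codebook[j] = members.mean(dim=0)           # move centroid
    sizes = [w.numel() // d for w in weights]
    return codebook, assign.split(sizes)

# Layer i is reconstructed as codebook[idx_i].reshape(weights[i].shape);
# codebook fine-tuning then retrains only the k codebook entries, with the
# index assignments frozen, to recover the accuracy lost to quantization.
```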

Description

Technical field

[0001] The invention relates to a network model compression method, in particular to a depth model compression method oriented to edge-device multi-layer shared codebook vector quantization.

Background technique

[0002] Deep neural network models have been widely used in computer vision, speech recognition, natural language processing, autonomous driving and other fields, and have huge application prospects on mobile-terminal and embedded edge devices. As more and more artificial intelligence solutions reach the deployment stage, some scenarios place strict requirements on model computation speed and network transmission speed; edge computing meets these requirements by deploying the deep model directly on the edge device, which produces results as soon as it receives data, without relying on a cloud environment. But running deep models requires powerful computing power...

Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/08; G06N3/04
CPC: G06N3/082; G06N3/045
Inventors: 黄明忠, 刘研, 赵立杰, 王国刚
Owner: SHENYANG INSTITUTE OF CHEMICAL TECHNOLOGY