A deep convolutional neural network model adaptive quantization method based on modulus length clustering

A neural network model adaptive quantization technology, classified under biological neural network models, neural architectures, etc. It addresses problems such as limited storage and computing resources and degraded network performance, with the effects of avoiding the selection of unnecessary clustering points, fast clustering, and reduced complexity.

Active Publication Date: 2019-04-16
BEIHANG UNIV
Cites: 4; Cited by: 13

AI Technical Summary

Problems solved by technology

Although some of these algorithms can also replace multiplication with shift operations, their optimization goals (maximum probability criterion) and optimization methods (L1 and L2 regularization) usually cause the neural network parameters to follow a centrally symmetric, non-uniform distribution, which leads to a larger drop in network performance.
Although FPGA is not inferior to GPU in parallel computing capability, it is limited by its storage resources and computing resources.
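
To illustrate why shift operations are attractive on hardware such as an FPGA, here is a minimal sketch, assuming a weight has already been quantized to a signed power of two. The function name `shift_multiply` and its parameters are hypothetical and do not come from the patent; the point is only that multiplying an integer activation by ±2^k reduces to a bit shift.

```python
def shift_multiply(x: int, exponent: int, negative: bool = False) -> int:
    """Multiply the integer x by +/- 2**exponent using only bit shifts.

    Hypothetical helper for illustration; not taken from the patent.
    """
    # A non-negative exponent is a left shift; a negative exponent is an
    # arithmetic right shift, which rounds toward negative infinity.
    shifted = x << exponent if exponent >= 0 else x >> -exponent
    return -shifted if negative else shifted


print(shift_multiply(12, 3))         # 12 * 2**3
print(shift_multiply(12, -2))        # 12 * 2**-2
print(shift_multiply(5, 1, True))    # 5 * -(2**1)
```

Note that the right shift discards low-order bits, so results for negative exponents are exact only when `x` is divisible by the corresponding power of two.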



Examples


Embodiment Construction

[0034] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0035] Referring to Figure 1, which is a flow chart of the adaptive quantization method for deep convolutional neural network models based on modulus length clustering, the design and implementation of the method are divided into three main parts: preprocessing of the network model parameters, group quantization of the network model parameters, and decomposition of quantized value...


Abstract

The invention discloses an adaptive quantization method for deep convolutional neural network models and designs a low-bit quantization algorithm for deep convolutional networks that is suited to FPGA computation. It mainly comprises preprocessing of the network model parameters and a grouped adaptive quantization method for the parameter set. The method includes: acquiring dynamic thresholds to perform coarse-grained cutting on the original parameters of the model; constructing an initial clustering center point set suitable for FPGA shift calculation; grouping and clustering the preprocessed model parameters based on a modulus length minimization method; and finally, overlaying the clustering center point set with the non-null parameter classes, achieving adaptive low-bit quantization of different networks through optimization. The quantization algorithm is moderate in complexity and matches the computing characteristics of the FPGA well: hardware resource consumption on the FPGA is reduced, and model inference speed is increased while model inference accuracy is maintained.
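
The steps listed in the abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patented procedure: the threshold rule (a fixed fraction `ratio` of the largest magnitude), the center set (0 and ±2^-k), the use of absolute distance as the "modulus length", and all function names are hypothetical, and the grouping step is omitted.

```python
def prune_by_threshold(weights, ratio=0.05):
    """Coarse-grained cutting: zero out weights below a threshold.

    Assumption: the threshold is a fixed fraction of the largest magnitude.
    The patent's dynamic threshold rule is not given in this excerpt.
    """
    threshold = ratio * max(abs(w) for w in weights)
    return [0.0 if abs(w) < threshold else w for w in weights]


def power_of_two_centers(num_exponents=4):
    """Initial cluster centers usable with shifts: 0 and +/- 2**-k (assumed form)."""
    magnitudes = [2.0 ** -k for k in range(num_exponents)]
    return sorted([0.0] + magnitudes + [-m for m in magnitudes])


def quantize_to_centers(weights, centers):
    """Assign each weight to the center at minimum absolute distance.

    Returns the quantized weights and the centers whose class is non-empty.
    """
    quantized = [min(centers, key=lambda c: abs(w - c)) for w in weights]
    used_centers = sorted(set(quantized))
    return quantized, used_centers


weights = [0.9, -0.3, 0.01, 0.13, -0.55]
pruned = prune_by_threshold(weights)
quantized, used = quantize_to_centers(pruned, power_of_two_centers())
print(quantized)
print(used)
```

Keeping only the centers with a non-empty class means the final codebook adapts to each network's weight distribution, which is one plausible reading of the "adaptive" low-bit quantization described above.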

Description

Technical field

[0001] The invention relates to the technical field of deep network model compression, in particular to an adaptive quantization method for deep convolutional neural network models based on modulus length clustering.

Background technique

[0002] With the rapid development of deep learning technology, deep neural networks have achieved major breakthroughs in computer vision, speech recognition, natural language processing, and other fields. However, deep learning algorithms have not been widely used in industry, manufacturing, aerospace, and navigation. One reason is that deep learning models are very large and computationally expensive. The weight file of a CNN can easily reach hundreds of megabytes: AlexNet, for example, has 61M parameters and occupies 249MB of memory, and the more complex VGG16 and VGG19 exceed 500MB, which means that more storage capacity and more floating-point operations are required. Due to the limited...
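
As a rough check on the scale of these figures, the sketch below estimates the raw storage of a parameter set at a given bit width. It assumes every parameter is stored at the same width and ignores file-format overhead, so it will not reproduce the quoted 249MB exactly; the function name and the 4-bit case are illustrative assumptions, not values from the patent.

```python
def model_size_mb(num_params: int, bits_per_param: int) -> float:
    """Raw parameter storage in megabytes (10**6 bytes), ignoring any overhead."""
    return num_params * bits_per_param / 8 / 1e6


alexnet_params = 61_000_000  # parameter count quoted in the text

print(model_size_mb(alexnet_params, 32))  # 32-bit floating-point storage
print(model_size_mb(alexnet_params, 4))   # hypothetical 4-bit quantized storage
```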


Application Information

IPC(8): G06N3/04
CPC: G06N3/045
Inventor: 姜宏旭, 李晓宾, 李浩, 韩琪, 黄双喜
Owner: BEIHANG UNIV