Adaptive quantization method for deep convolutional neural network models based on modulus length clustering

A neural network model and adaptive quantization technology, applied to biological neural network models, neural architectures, etc. It addresses problems such as limited storage and computing resources and network performance degradation, with the effects of fewer unnecessary clustering points, fast clustering, and reduced complexity.

Active Publication Date: 2019-04-16

BEIHANG UNIV

4 Cites, 13 Cited by


Problems solved by technology

Although some of these algorithms can also replace multiplication with shift operations, their optimization objective (the maximum probability criterion) and optimization methods (L1 and L2 regularization) usually cause the neural network parameters to follow a centrally symmetric, non-uniform distribution, which leads to a larger drop in network performance.

Although FPGAs are not inferior to GPUs in parallel computing capability, they are limited by their storage and computing resources.
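As an illustration of replacing multiplication with shift operations, here is a minimal Python sketch. It is not taken from the patent: it assumes a weight has already been quantized to the form sign × 2^exponent and that activations are integers, and the function name `shift_multiply` and its parameters are hypothetical.

```python
def shift_multiply(x: int, sign: int, exponent: int) -> int:
    """Multiply integer x by sign * 2**exponent using only shifts and negation.

    Assumption (not from the patent): the weight is stored as a sign
    (+1 or -1) and an integer exponent.
    """
    if exponent >= 0:
        product = x << exponent      # x * 2**exponent
    else:
        product = x >> (-exponent)   # x / 2**(-exponent), rounded toward -infinity
    return -product if sign < 0 else product


# A weight of +8 (2**3) applied to an activation of 12.
result = shift_multiply(12, 1, 3)
```

On hardware without abundant multipliers, a shift like this costs far less logic than a general multiplication, which is why a power-of-two weight set suits resource-limited FPGAs. Note that negative exponents round the result, so precision depends on the fixed-point format chosen for the activations.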


Embodiment Construction

[0034] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0035] Figure 1 is a flow chart of the adaptive quantization method for deep convolutional neural network models based on modulus length clustering. The design and implementation of the method are divided into three main parts: preprocessing of the network model parameters, group quantization of the network model parameters, and decomposition of quantized value...


Abstract

The invention discloses an adaptive quantization method for deep convolutional neural network models. It designs a low-bit quantization algorithm for deep convolutional networks that is suited to FPGA computation and mainly comprises preprocessing of the network model parameters and a grouped adaptive quantization method for the parameter set. The method includes: acquiring dynamic thresholds to perform coarse-grained pruning of the original model parameters; constructing an initial cluster center point set suited to FPGA shift computation; grouping and clustering the preprocessed model parameters by modulus length minimization; and finally, optimizing the cluster center point set so that it covers the non-empty parameter classes, which achieves adaptive low-bit quantization for different networks. The quantization algorithm is moderate in complexity and matches the computing characteristics of the FPGA well. Hardware resource consumption on the FPGA is reduced, and model inference speed is increased while inference accuracy is maintained.
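The steps listed in the abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patented algorithm: the threshold rule (a fixed fraction `prune_ratio` of the largest absolute weight), the exponent range, the use of plain absolute difference as the distance, and all function and parameter names are hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple


def power_of_two_centers(min_exp: int, max_exp: int) -> List[float]:
    """Candidate centers 0 and +/-2**k, so each product reduces to a shift."""
    centers = [0.0]
    for k in range(min_exp, max_exp + 1):
        centers += [2.0 ** k, -(2.0 ** k)]
    return centers


def quantize(weights: List[float], prune_ratio: float = 0.05,
             min_exp: int = -4, max_exp: int = 0) -> Tuple[List[float], List[float]]:
    """Return (quantized weights, centers actually used). Expects a non-empty list."""
    # 1. Coarse-grained pruning. Assumption: the threshold is a fixed
    #    fraction of the largest absolute weight.
    threshold = prune_ratio * max(abs(w) for w in weights)
    pruned = [0.0 if abs(w) < threshold else w for w in weights]

    # 2. Initial cluster center set made of powers of two.
    centers = power_of_two_centers(min_exp, max_exp)

    # 3. Assign each weight to the center with the smallest |w - c|.
    clusters: Dict[float, List[float]] = {}
    quantized = []
    for w in pruned:
        nearest = min(centers, key=lambda c: abs(w - c))
        clusters.setdefault(nearest, []).append(w)
        quantized.append(nearest)

    # 4. Keep only the centers whose class is non-empty.
    used_centers = sorted(clusters)
    return quantized, used_centers


quantized, used_centers = quantize([0.9, -0.3, 0.01, 0.12, -0.55])
```

In this sketch the number of distinct centers that survive step 4 determines how many bits are needed to index a weight, which is one way a bit width could adapt to different networks.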

Description

Technical field

[0001] The invention relates to the technical field of deep network model compression, and in particular to an adaptive quantization method for deep convolutional neural network models based on modulus length clustering.

Background art

[0002] With the rapid development of deep learning technology, deep neural networks have achieved major breakthroughs in computer vision, speech recognition, natural language processing and other fields. However, deep learning algorithms have not yet been widely used in industry, manufacturing, aerospace and navigation. One reason is that deep learning network models are large and computationally expensive. The weight file of a CNN can easily reach hundreds of megabytes: AlexNet, for example, has 61M parameters and occupies 249MB of memory, while the more complex VGG16 and VGG19 exceed 500MB, which means larger storage capacity and more floating-point operations are required. Due to the limited...
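To make the storage figures concrete, the short Python sketch below estimates raw weight storage from a parameter count and a bit width. The 61M parameter count comes from the text; the 32-bit floating-point format and the 4-bit low-bit example are assumptions for illustration, and `model_size_mb` is a hypothetical helper.

```python
def model_size_mb(num_params: int, bits_per_param: int) -> float:
    """Raw weight storage in megabytes (1 MB = 10**6 bytes), ignoring file overhead."""
    return num_params * bits_per_param / 8 / 1e6


alexnet_params = 61_000_000                      # parameter count quoted in the text
fp32_mb = model_size_mb(alexnet_params, 32)      # assumed 32-bit float weights
low_bit_mb = model_size_mb(alexnet_params, 4)    # assumed 4-bit quantized weights
```

Under these assumptions the 32-bit estimate is 244 MB, the same order as the 249MB quoted for AlexNet, while 4-bit weights would need about one eighth of that.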


Application Information


IPC(8): G06N3/04

CPC: G06N3/045

Inventor: 姜宏旭, 李晓宾, 李浩, 韩琪, 黄双喜

Owner: BEIHANG UNIV
