Low-bit efficient deep convolutional neural network hardware acceleration design method based on logarithm quantization, and module and system

A deep convolutional neural network technology, applied in the field of artificial neural network hardware implementation, that addresses the high computational complexity of convolution and the high hardware complexity, area, and energy cost of multipliers, so as to reduce hardware complexity, simplify the design method, and yield a regular architecture.

Active Publication Date: 2018-09-04

SOUTHEAST UNIV

2 Cites, 12 Cited by


AI Technical Summary

Problems solved by technology

However, a major shortcoming of convolutional neural networks is their extremely high computational complexity: they require a huge amount of computation, which makes real-time operation difficult on mobile systems and embedded devices. Convolution operations account for more than 90 percent of the total computation.

Traditional computers based on serial architectures struggle to meet these requirements, so achieving fast convolutional neural network computation with low hardware power consumption and small hardware area has become one of the key difficulties.

[0004] Direct convolution is implemented with multiply-add modules built around fixed-point or floating-point multipliers, but a multiplier has high hardware complexity and consumes considerable area and energy.

A large-scale neural network requires a large number of multiply-add units, so the cost of the convolution operations becomes very high.


Embodiment Construction

[0028] The technical solution of the present invention will be further introduced below in combination with specific embodiments.

[0029] This specific embodiment discloses a low-bit high-efficiency deep convolutional neural network hardware acceleration design method based on logarithmic quantization, including the following steps:

[0030] S1: Implement low-bit, high-precision non-uniform fixed-point quantization in the logarithmic domain, and use multiple quantization codebooks to quantize the full-precision pre-trained neural network model;

[0031] S2: Control the quantization range by introducing an offset shift parameter; for extremely low-bit non-uniform quantization, an algorithm that adaptively searches for the optimal quantization strategy compensates for the quantization error.
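Steps S1 and S2 can be illustrated with a minimal Python sketch. It is not the patented algorithm: the excerpt does not show how the codebooks are built or how the search works, so everything here is an assumption made for illustration. The sketch assumes each weight is rounded to a signed power of two, that one code is reserved for zero, that the shift parameter sets the largest representable exponent, and that the "adaptive search" is a brute-force scan minimizing squared error. The names `log_quantize`, `quantization_error`, and `best_shift` are hypothetical.

```python
import math


def log_quantize(w, bits, shift):
    """Round w to sign * 2**e, with e limited to a window ending at `shift`.

    Assumed layout (not from the patent): 1 sign bit, and the remaining
    bits - 1 bits index 2**(bits - 1) - 1 exponents plus one code for zero.
    """
    levels = 2 ** (bits - 1) - 1       # number of representable exponents
    e_max = shift                      # the shift parameter places the window
    e_min = shift - levels + 1
    if w == 0.0:
        return 0.0
    e = round(math.log2(abs(w)))       # nearest exponent in the log domain
    if e < e_min:
        return 0.0                     # too small for this window: flush to zero
    e = min(e, e_max)                  # too large: saturate at the top exponent
    return math.copysign(2.0 ** e, w)


def quantization_error(weights, bits, shift):
    """Sum of squared differences between the weights and their quantized values."""
    return sum((w - log_quantize(w, bits, shift)) ** 2 for w in weights)


def best_shift(weights, bits, candidates=range(-8, 9)):
    """Brute-force stand-in for the adaptive search: pick the lowest-error shift."""
    return min(candidates, key=lambda s: quantization_error(weights, bits, s))


if __name__ == "__main__":
    layer = [0.5, -0.25, 0.125, 0.06]
    s = best_shift(layer, bits=3)
    print(s, [log_quantize(w, 3, s) for w in layer])
```

Each candidate shift defines one small codebook, so running `best_shift` per layer is one plausible reading of "multiple quantization codebooks"; the patent may construct them differently.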

[0032] Usually, to reduce hardware complexity, uniform quantization to a fixed-point number of a given bit width is applied to full-precision floating-p...


Abstract

The present invention discloses a low-bit, high-efficiency deep convolutional neural network hardware acceleration design method based on logarithmic quantization. The method comprises the following steps: S1: implementing low-bit, high-precision non-uniform fixed-point quantization in the logarithmic domain, and using multiple quantization codebooks to quantize the full-precision pre-trained neural network model; and S2: controlling the quantization range by introducing an offset shift parameter and, in the case of extremely low-bit non-uniform quantization, using an algorithm that adaptively searches for the optimal quantization strategy to compensate for the quantization error. The present invention also discloses one-dimensional and two-dimensional systolic array processing modules and a system using the method. According to the technical scheme of the present invention, hardware complexity and power consumption can be effectively reduced.
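One general reason logarithmic quantization can reduce hardware cost is that multiplying by a power-of-two weight becomes a bit shift, so no multiplier is needed. The following Python sketch shows a shift-and-add dot product under assumptions that are not taken from the patent: weights are stored as hypothetical `(sign, exponent)` pairs meaning `sign * 2**exponent`, activations are integers, and the accumulator keeps `FRAC_BITS` fractional bits. It does not model the systolic array modules described in the abstract.

```python
FRAC_BITS = 8  # assumed number of fractional bits in the accumulator


def shift_mac(activations, weight_codes):
    """Dot product using only shifts and adds.

    Each weight code is (sign, exponent) with sign in {-1, 0, +1}.
    Returns a fixed-point integer scaled by 2**FRAC_BITS.
    """
    acc = 0
    for a, (sign, exp) in zip(activations, weight_codes):
        if sign == 0:
            continue                      # zero weight contributes nothing
        if exp < -FRAC_BITS:
            raise ValueError("exponent below accumulator precision")
        term = a << (exp + FRAC_BITS)     # a * 2**exp in fixed point
        acc += term if sign > 0 else -term
    return acc


if __name__ == "__main__":
    acts = [3, -2, 5]
    codes = [(1, -1), (-1, 0), (1, -2)]   # weights 0.5, -1.0, 0.25
    raw = shift_mac(acts, codes)
    print(raw, raw / 2 ** FRAC_BITS)
```

Keeping fractional bits in the accumulator avoids the truncation that a plain right shift would introduce for negative exponents; a hardware design would choose this width according to its own accuracy and area budget.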

Description

Technical field

[0001] The invention relates to artificial neural network hardware implementation technology, in particular to a low-bit, high-efficiency deep convolutional neural network hardware acceleration design method based on logarithmic quantization, and to a corresponding module and system.

Background

[0002] The Convolutional Neural Network (CNN) is an important mathematical model in Deep Learning (DL), with a powerful ability to extract hidden features from high-dimensional data. In recent years it has achieved major breakthroughs in many fields, such as object recognition, image classification, drug discovery, natural language processing, and Go, greatly improving on the performance of earlier systems. As a result, deep convolutional neural networks have been widely studied by scholars all over the world and widely deployed in commercial applications.

[0003] A convolutional neural network with deeper layers and larger parameter scale tends to hav...


Application Information


IPC(8): G06N3/04, G06N3/08

CPC: G06N3/08, G06N3/045

Inventors: 张川, 徐炜鸿, 尤肖虎

Owner: SOUTHEAST UNIV
