A low-bit efficient deep convolutional neural network hardware acceleration design method, module and system based on logarithmic quantization

A deep convolutional neural network technology, applied in the field of artificial neural network hardware implementation. It addresses problems such as high computational complexity, large area and energy cost, and the high hardware complexity of multipliers, with the effect of reducing hardware complexity, simplifying the design method, and reducing repeated memory reads.

Active Publication Date: 2022-04-12
SOUTHEAST UNIV

Problems solved by technology

However, a fatal shortcoming of convolutional neural networks is their extremely high computational complexity: the enormous amount of computation makes real-time inference difficult on mobile systems and embedded devices, and convolution operations account for more than 90% of the total computation.
Traditional computers based on serial architectures can hardly meet these requirements, so realizing fast convolutional neural network operations with low hardware power consumption and small hardware area has become one of the key difficulties.
[0004] Direct convolution can be implemented with fixed-point or floating-point multipliers in a multiply-add module, but the hardware complexity of a multiplier is high, and it consumes considerable area and energy.
In a large-scale neural network, a large number of multiply-add units are required, so the cost of convolution operations becomes very high.
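The saving that logarithmic quantization targets comes from the fact that multiplying by a signed power of two reduces to a bit shift, so a full multiplier can be replaced by a much cheaper shifter. The sketch below is an illustrative stand-in in plain Python, not the patent's circuit; the function name and argument layout are my own assumptions:

```python
# If a weight is quantized to a signed power of two, w = sign * 2**exponent,
# then w * x becomes a shift of x (plus a sign flip). In hardware this
# replaces a full multiplier with a barrel shifter and a negator.

def mul_by_log_weight(x: int, sign: int, exponent: int) -> int:
    """Multiply integer activation x by w = sign * 2**exponent using shifts."""
    if exponent >= 0:
        y = x << exponent
    else:
        y = x >> (-exponent)   # arithmetic right shift for negative exponents
    return y if sign >= 0 else -y

# Example: w = -2**3 = -8, x = 5  ->  -40
assert mul_by_log_weight(5, -1, 3) == -40
```

This is why the bit width of the exponent, rather than a full mantissa, determines the datapath cost in log-domain designs.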




Embodiment Construction

[0028] The technical solution of the present invention will be further introduced below in combination with specific embodiments.

[0029] This specific embodiment discloses a low-bit high-efficiency deep convolutional neural network hardware acceleration design method based on logarithmic quantization, including the following steps:

[0030] S1: Perform low-bit, high-precision non-uniform fixed-point quantization in the logarithmic domain, using multiple quantization codebooks to quantize a full-precision pre-trained neural network model;

[0031] S2: Control the quantization range by introducing offset shift parameters; under extremely low-bit non-uniform quantization, an algorithm that adaptively searches for the optimal quantization strategy compensates for the quantization error.
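A rough sketch of the two steps in NumPy. This is my own minimal reading, not the patent's actual algorithm: the exponent clipping, the `offset` semantics, and the exhaustive MSE search in `best_offset` are all illustrative assumptions:

```python
import numpy as np

def log_quantize(w, bits=3, offset=0):
    """S1 sketch: map each weight to the nearest signed power of two within a
    bits-wide exponent window; `offset` slides that window up or down."""
    levels = 2 ** (bits - 1)                        # exponent codes available
    sign = np.sign(w)
    e = np.round(np.log2(np.abs(w) + 1e-12))        # nearest power-of-two exponent
    e = np.clip(e, -offset - levels + 1, -offset)   # offset shifts the range
    return sign * 2.0 ** e

def best_offset(w, bits=3, candidates=range(-4, 5)):
    """S2 sketch: exhaustively pick the offset with least mean squared error."""
    return min(candidates,
               key=lambda o: float(np.mean((w - log_quantize(w, bits, o)) ** 2)))
```

For example, weights clustered near 0.03 would drive `best_offset` to a positive offset, moving the representable exponents down toward 2**-5 instead of wasting codes near 1.0; this is the error-compensation role the offset plays.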

[0032] Usually, in order to simplify hardware complexity, full-precision floating-point values are uniformly quantized to fixed-point numbers of a certain bit width...
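For contrast, the conventional uniform fixed-point baseline the paragraph refers to can be sketched as follows (a hedged illustration, not the patent's scheme; the symmetric max-scaling choice is my own assumption):

```python
import numpy as np

def uniform_quantize(w, bits=8):
    """Symmetric uniform fixed-point quantization: all levels are evenly
    spaced, so small weights get the same absolute step as large ones."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)   # one step in real units
    return np.round(w / scale) * scale
```

Under uniform quantization every product still needs a genuine multiplier; the log-domain scheme of S1 trades the even spacing for power-of-two levels precisely so that multiplications become shifts.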



Abstract

The invention discloses a low-bit high-efficiency deep convolutional neural network hardware acceleration design method based on logarithmic quantization, comprising the following steps. S1: perform low-bit, high-precision non-uniform fixed-point quantization in the logarithmic domain, using multiple quantization codebooks to quantize a full-precision pre-trained neural network model. S2: control the quantization range by introducing offset shift parameters, and compensate for quantization errors with an algorithm that adaptively searches for the optimal quantization strategy under extremely low-bit non-uniform quantization. The invention also discloses one-dimensional and two-dimensional systolic array processing modules and a system using the method. The invention can effectively reduce hardware complexity and power consumption.

Description

Technical field

[0001] The invention relates to artificial neural network hardware implementation technology, and in particular to a logarithmic-quantization-based low-bit high-efficiency deep convolutional neural network hardware acceleration design method, module, and system.

Background technique

[0002] The Convolutional Neural Network (CNN) is an important mathematical model in Deep Learning (DL) with a powerful ability to extract hidden features from high-dimensional data. In recent years it has achieved major breakthroughs and greatly improved system performance in many fields, such as target recognition, image classification, drug discovery, natural language processing, and the game of Go. Deep convolutional neural networks have therefore been widely studied by scholars all over the world and widely deployed in commercial applications.

[0003] A convolutional neural network with deeper layers and a larger parameter scale tends to hav...

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06N 3/04, G06N 3/08
CPC: G06N 3/08, G06N 3/045
Inventors: 张川 (Zhang Chuan), 徐炜鸿 (Xu Weihong), 尤肖虎 (You Xiaohu)
Owner: SOUTHEAST UNIV