A low-bit efficient deep convolutional neural network hardware acceleration design method, module and system based on logarithmic quantization

A deep convolutional neural network technology, applied in the field of artificial neural network hardware implementation. It addresses problems such as high computational complexity, large area and energy cost, and the high hardware complexity of multipliers, with the effect of reducing hardware complexity, simplifying the design method, and reducing repeated memory reads.

Active Publication Date: 2022-04-12
SOUTHEAST UNIV

Problems solved by technology

However, a fatal shortcoming of convolutional neural networks is their extremely high computational complexity: the enormous amount of computation makes real-time inference difficult on mobile systems and embedded devices, and convolution operations account for more than 90% of the total computation.
Traditional computers based on serial architectures can hardly meet these requirements, so realizing fast convolutional neural network operations with low hardware power consumption and small hardware area has become one of the key difficulties.
[0004] Direct convolution can be implemented with fixed-point or floating-point multipliers in a multiply-add module, but the hardware complexity of a multiplier is high, and it consumes considerable area and energy.
In a large-scale neural network, a large number of multiply-add units are required, so the cost of convolution operations becomes very high.
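The saving that logarithmic quantization targets comes from the fact that multiplying by a signed power of two reduces to a bit shift, so a full multiplier can be replaced by a much cheaper shifter. The sketch below is an illustrative stand-in in plain Python, not the patent's circuit; the function name and argument layout are my own assumptions:

```python
# If a weight is quantized to a signed power of two, w = sign * 2**exponent,
# then w * x becomes a shift of x (plus a sign flip). In hardware this
# replaces a full multiplier with a barrel shifter and a negator.

def mul_by_log_weight(x: int, sign: int, exponent: int) -> int:
    """Multiply integer activation x by w = sign * 2**exponent using shifts."""
    if exponent >= 0:
        y = x << exponent
    else:
        y = x >> (-exponent)   # arithmetic right shift for negative exponents
    return y if sign >= 0 else -y

# Example: w = -2**3 = -8, x = 5  ->  -40
assert mul_by_log_weight(5, -1, 3) == -40
```

This is why the bit width of the exponent, rather than a full mantissa, determines the datapath cost in log-domain designs.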




Embodiment Construction

[0028] The technical solution of the present invention will be further introduced below in combination with specific embodiments.

[0029] This specific embodiment discloses a low-bit high-efficiency deep convolutional neural network hardware acceleration design method based on logarithmic quantization, including the following steps:

[0030] S1: Perform low-bit, high-precision non-uniform fixed-point quantization in the logarithmic domain, using multiple quantization codebooks to quantize a full-precision pre-trained neural network model;

[0031] S2: Control the quantization range by introducing offset shift parameters; under extremely low-bit non-uniform quantization, an algorithm that adaptively searches for the optimal quantization strategy compensates for the quantization error.
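A rough sketch of the two steps in NumPy. This is my own minimal reading, not the patent's actual algorithm: the exponent clipping, the `offset` semantics, and the exhaustive MSE search in `best_offset` are all illustrative assumptions:

```python
import numpy as np

def log_quantize(w, bits=3, offset=0):
    """S1 sketch: map each weight to the nearest signed power of two within a
    bits-wide exponent window; `offset` slides that window up or down."""
    levels = 2 ** (bits - 1)                        # exponent codes available
    sign = np.sign(w)
    e = np.round(np.log2(np.abs(w) + 1e-12))        # nearest power-of-two exponent
    e = np.clip(e, -offset - levels + 1, -offset)   # offset shifts the range
    return sign * 2.0 ** e

def best_offset(w, bits=3, candidates=range(-4, 5)):
    """S2 sketch: exhaustively pick the offset with least mean squared error."""
    return min(candidates,
               key=lambda o: float(np.mean((w - log_quantize(w, bits, o)) ** 2)))
```

For example, weights clustered near 0.03 would drive `best_offset` to a positive offset, moving the representable exponents down toward 2**-5 instead of wasting codes near 1.0; this is the error-compensation role the offset plays.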

[0032] Usually, in order to simplify hardware complexity, full-precision floating-point values are uniformly quantized to fixed-point numbers of a certain bit width...
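For contrast, the conventional uniform fixed-point baseline the paragraph refers to can be sketched as follows (a hedged illustration, not the patent's scheme; the symmetric max-scaling choice is my own assumption):

```python
import numpy as np

def uniform_quantize(w, bits=8):
    """Symmetric uniform fixed-point quantization: all levels are evenly
    spaced, so small weights get the same absolute step as large ones."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)   # one step in real units
    return np.round(w / scale) * scale
```

Under uniform quantization every product still needs a genuine multiplier; the log-domain scheme of S1 trades the even spacing for power-of-two levels precisely so that multiplications become shifts.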



Abstract

The invention discloses a low-bit high-efficiency deep convolutional neural network hardware acceleration design method based on logarithmic quantization, comprising the following steps. S1: perform low-bit, high-precision non-uniform fixed-point quantization in the logarithmic domain, using multiple quantization codebooks to quantize a full-precision pre-trained neural network model. S2: control the quantization range by introducing offset shift parameters, and compensate for quantization errors with an algorithm that adaptively searches for the optimal quantization strategy under extremely low-bit non-uniform quantization. The invention also discloses one-dimensional and two-dimensional systolic array processing modules and a system using the method. The invention can effectively reduce hardware complexity and power consumption.

Description

Technical field

[0001] The invention relates to artificial neural network hardware implementation technology, and in particular to a logarithmic-quantization-based low-bit high-efficiency deep convolutional neural network hardware acceleration design method, module, and system.

Background technique

[0002] The Convolutional Neural Network (CNN) is an important mathematical model in Deep Learning (DL) with a powerful ability to extract hidden features from high-dimensional data. In recent years it has achieved major breakthroughs and greatly improved system performance in many fields, such as target recognition, image classification, drug discovery, natural language processing, and the game of Go. Deep convolutional neural networks have therefore been widely studied by scholars all over the world and widely deployed in commercial applications.

[0003] A convolutional neural network with deeper layers and a larger parameter scale tends to hav...

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06N 3/04, G06N 3/08
CPC: G06N 3/08, G06N 3/045
Inventors: 张川 (Zhang Chuan), 徐炜鸿 (Xu Weihong), 尤肖虎 (You Xiaohu)
Owner: SOUTHEAST UNIV