CNN acceleration calculation method and system based on low-precision floating-point number data representation form

A data representation and calculation technique, applied in the field of deep convolutional neural network quantization, that addresses problems such as low acceleration performance.

Active Publication Date: 2020-02-28

深圳市比昂芯科技有限公司

Cites: 9; Cited by: 13

Problems solved by technology

[0004] Although current techniques improve quantization accuracy, several limitations remain: 1) for quantized deep convolutional neural networks (more than 100 convolutional / fully connected layers), retraining is required to guarantee accuracy; 2) quantization needs 16-bit floating-point numbers or 8-bit fixed-point numbers to ensure accuracy; 3) without retraining and while ensuring accuracy, current techniques can perform at most two multiplication operations in one DSP, resulting in lower acceleration performance on the FPGA.

Examples

Embodiment 1

[0059] This embodiment provides a CNN acceleration calculation method and system based on low-precision floating-point data representation. It uses the low-precision floating-point representation MaEb to ensure the accuracy rate of the quantized convolutional neural network without retraining, performs low-precision floating-point multiplication on MaEb floating-point numbers, and realizes Nm MaEb floating-point multipliers through a DSP, improving the acceleration performance of custom circuits or non-custom circuits, as follows:
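To make the MaEb idea concrete, the following is a minimal sketch of rounding a value to a format with an a-bit mantissa and a b-bit exponent. The source text does not spell out the bit layout, so the sign bit, the exponent bias, the implicit leading 1, the absence of subnormals and the saturate / flush-to-zero behaviour are all assumptions made for illustration, and `quantize_maeb` is a hypothetical name.

```python
import math


def quantize_maeb(x: float, a: int, b: int) -> float:
    """Round a finite float to the nearest value of an assumed MaEb format.

    Assumptions (not taken from the patent text): 1 sign bit, b exponent
    bits with bias 2**(b-1) - 1, a fraction bits with an implicit leading 1,
    no subnormals, no inf/NaN codes. Too-large values saturate, too-small
    values flush to zero.
    """
    if a <= 0 or b <= 0 or a + b > 31:
        raise ValueError("need positive a, b with a + b <= 31")
    if x == 0.0:
        return 0.0

    sign = -1.0 if x < 0 else 1.0
    bias = 2 ** (b - 1) - 1
    e_min = -bias                  # smallest exponent (stored code 0)
    e_max = (2 ** b - 1) - bias    # largest exponent (stored code 2**b - 1)

    m, e = math.frexp(abs(x))      # abs(x) = m * 2**e with 0.5 <= m < 1
    m, e = m * 2.0, e - 1          # renormalise so that 1 <= m < 2

    scaled = round(m * 2 ** a)     # mantissa as an integer, ties to even
    if scaled == 2 ** (a + 1):     # rounding carried into the next binade
        scaled, e = 2 ** a, e + 1

    if e > e_max:                  # saturate to the largest magnitude
        return sign * (2.0 - 2.0 ** -a) * 2.0 ** e_max
    if e < e_min:                  # flush small magnitudes to zero
        return 0.0
    return sign * scaled / 2 ** a * 2.0 ** e
```

For example, under these assumptions `quantize_maeb(0.3, 4, 3)` gives 0.296875, that is, 19/16 × 2^-2.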

[0060] A CNN acceleration calculation method based on low-precision floating-point number data representation, including the following steps:

[0061] The central control module generates a control signal to arbitrate the floating-point function module and the storage system;

[0062] The floating-point function module receives the input activation value and weight from the storage system according to the control signal, and distributes the...
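The distribution of activations and weights to processing units can be pictured with a small software stand-in. This is only a sketch under stated assumptions: a 1-D "valid" convolution, round-robin assignment of tasks to PEs, and values that are taken to be already quantized. The names `pe_dot` and `conv1d_on_pes` and the `num_pes` parameter are hypothetical and do not come from the patent.

```python
def pe_dot(activations, weights):
    """One processing element (PE): dot product of a window with a kernel."""
    return sum(x * w for x, w in zip(activations, weights))


def conv1d_on_pes(feature, kernels, num_pes=4):
    """1-D valid convolution where each (kernel, window) pair is one PE task.

    Returns the output feature maps and the schedule of
    (pe_index, kernel_index, window_index) assignments. The PEs are
    simulated sequentially; hardware would run them in parallel.
    """
    k = len(kernels[0])
    windows = [feature[i:i + k] for i in range(len(feature) - k + 1)]
    outputs = [[0.0] * len(windows) for _ in kernels]
    schedule = []
    task = 0
    for ki, kernel in enumerate(kernels):
        for wi, window in enumerate(windows):
            pe = task % num_pes            # assumed round-robin distribution
            outputs[ki][wi] = pe_dot(window, kernel)
            schedule.append((pe, ki, wi))
            task += 1
    return outputs, schedule
```

With `feature = [1, 2, 3, 4]` and `kernels = [[1, 0], [1, 1]]`, the outputs are `[[1, 2, 3], [3, 5, 7]]`.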

Embodiment 2

[0117] Based on Embodiment 1, a system includes customized circuits or non-customized circuits; the customized circuits include an ASIC or SOC, and the non-customized circuits include FPGAs. As shown in Figure 1, the customized or non-customized circuit includes a floating-point function module, which receives input activation values and weights from the storage system according to control signals and distributes them to different processing units (PEs); the PEs compute in parallel the convolution of MaEb floating-point numbers quantized through the low-precision floating-point representation, where 0 < a+b ≤ 31 and a and b are both positive integers;

[0118] A storage system, used for caching input feature maps, weights and output feature maps;

[0119] The central control module is used for arbitrating the floating-point function module and the storage system after decoding the instruction into a control signal;

[0120] The floa...
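The constraint 0 < a+b ≤ 31 with positive integers a and b can be checked mechanically. The sketch below assumes one additional sign bit, which would make the widest format 32 bits; that sign bit is an assumption, and `check_maeb` and `formats_for_width` are hypothetical helper names.

```python
def check_maeb(a, b):
    """Validate an MaEb format and return its assumed storage width in bits."""
    if not (isinstance(a, int) and isinstance(b, int)):
        raise TypeError("a and b must be integers")
    if a <= 0 or b <= 0 or a + b > 31:
        raise ValueError("need positive a, b with a + b <= 31")
    return 1 + a + b  # assumption: one sign bit on top of mantissa and exponent


def formats_for_width(total_bits):
    """List every (a, b) pair whose assumed storage width equals total_bits."""
    pairs = [(a, total_bits - 1 - a) for a in range(1, total_bits - 1)]
    for a, b in pairs:
        check_maeb(a, b)  # raises if total_bits is outside the allowed range
    return pairs
```

Under this assumption an 8-bit word admits six candidate formats, from M1E6 to M6E1.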

Abstract

The invention discloses a CNN acceleration calculation method and system based on a low-precision floating-point number data representation form, and relates to the field of CNN acceleration calculation. The acceleration calculation method comprises the following steps: a floating-point number function module receives an input activation value and a weight from a storage system according to a control signal, and distributes the input activation value and the weight to different processing units PE for convolution calculation to complete CNN acceleration calculation; wherein the convolution calculation comprises forward calculation of a convolution layer completed by performing dot product calculation on MaEb floating-point numbers quantized through a low-precision floating-point number representation form. By calculating and using a low-precision floating-point number representation form MaEb, the accuracy of the quantized CNN is ensured under the condition that retraining is not needed; Nm MaEb floating-point multipliers are realized through low-precision floating-point multiplication and a DSP (Digital Signal Processor), so that the acceleration performance of a customized circuit or a non-customized circuit is greatly improved under the condition of ensuring the accuracy, the customized circuit is ASIC (Application Specific Integrated Circuit) or SOC (System On Chip), and the non-customized circuit comprises an FPGA (Field Programmable Gate Array).
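One way to see how a single DSP multiply might serve more than one low-precision multiplication is the well-known operand-packing trick: two short mantissas are placed in separate bit fields of one wide operand and multiplied by a shared operand. The sketch below illustrates that general technique only; the abstract does not describe the patent's own packing scheme, so this is not presented as its method. `packed_mantissa_products` is a hypothetical name, and `mant_bits` is assumed to be the mantissa width including any implicit leading 1.

```python
def packed_mantissa_products(m1, m2, shared, mant_bits):
    """Compute m1*shared and m2*shared with one wide integer multiply.

    All three inputs are unsigned mantissas below 2**mant_bits, so each
    product fits in 2*mant_bits bits and the two fields cannot overlap.
    """
    limit = 1 << mant_bits
    if not all(0 <= v < limit for v in (m1, m2, shared)):
        raise ValueError("mantissas must fit in mant_bits bits")
    width = 2 * mant_bits                  # bits reserved per product
    packed = (m1 << width) | m2            # two mantissas in one operand
    wide = packed * shared                 # the single wide multiply
    low = wide & ((1 << width) - 1)        # m2 * shared
    high = wide >> width                   # m1 * shared
    return high, low
```

In a floating-point multiplier the exponents would be added separately; only the mantissa products go through the wide multiplier in this sketch.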

Description

Technical field

[0001] The invention relates to the field of deep convolutional neural network quantization, in particular to a CNN acceleration calculation method and system based on low-precision floating-point number data representation.

Background art

[0002] In recent years, applications of AI (artificial intelligence) have penetrated many areas, such as face recognition, game playing, image processing and simulation. Although processing accuracy has improved, neural networks contain many layers and a large number of parameters, and therefore require very large computational cost and storage space. In response, technicians have proposed neural network compression schemes, that is, changing the network structure or using quantization and approximation methods to reduce network parameters or storage space, thereby reducing network cost and storage space without greatly affecting neural network performance. ...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06N3/04, G06N3/063, G06F7/483, G06F7/57

CPC: G06N3/063, G06F7/483, G06F7/57, G06N3/045

Inventor: 吴晨, 王铭宇, 徐世平

Owner: 深圳市比昂芯科技有限公司
