CNN acceleration calculation method and system based on low-precision floating-point number data representation form

A low-precision floating-point representation and calculation method, applied in the field of deep convolutional neural network quantization, that addresses problems such as low acceleration performance.

Active Publication Date: 2020-02-28
深圳市比昂芯科技有限公司

AI Technical Summary

Problems solved by technology

[0004] Although current techniques improve the quantization method and the quantization accuracy, several limitations remain: 1) for quantized deep convolutional neural networks (more than 100 convolutional / fully connected layers), retraining is required to guarantee accuracy; 2) quantization must use 16-bit floating-point numbers or 8-bit fixed-point numbers to guarantee accuracy; 3) without retraining and while preserving accuracy, current techniques can perform at most two multiplication operations in one DSP, resulting in low acceleration performance on the FPGA.
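Limitation 3 refers to fitting more than one multiplication into a single DSP multiplier. The following is a minimal Python sketch of one well-known packing trick for two multiplications that share an operand. It is an illustration under stated assumptions, not the scheme of this patent or of any specific prior work: the function name `packed_dual_multiply` is hypothetical, operands are assumed unsigned, and a Python integer stands in for the wide DSP multiplier.

```python
from typing import Tuple


def packed_dual_multiply(a: int, b: int, c: int, width: int = 8) -> Tuple[int, int]:
    """Return (a*c, b*c) using a single wide multiplication.

    Assumption: a, b and c are unsigned integers of `width` bits.
    """
    limit = 1 << width
    for value in (a, b, c):
        if not 0 <= value < limit:
            raise ValueError("operands must be unsigned %d-bit integers" % width)

    # b*c is below 2**(2*width), so shifting a by 2*width bits keeps the
    # two partial products in disjoint bit ranges of the wide result.
    shift = 2 * width
    packed = (a << shift) | b

    wide = packed * c                  # the only multiplication performed
    low = wide & ((1 << shift) - 1)    # lower field holds b*c
    high = wide >> shift               # upper field holds a*c
    return high, low


if __name__ == "__main__":
    print(packed_dual_multiply(200, 17, 255))
```

The wide product grows to four operand widths here, which is why the number of multiplications that fit in one fixed-width multiplier is limited when 8-bit operands are used.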


Examples


Embodiment 1

[0059] This embodiment provides a CNN acceleration calculation method and system based on a low-precision floating-point data representation. Using the low-precision floating-point representation MaEb, it preserves the accuracy of the quantized convolutional neural network without retraining. Low-precision floating-point multiplication is performed on MaEb floating-point numbers, and Nm MaEb floating-point multipliers are realized through a DSP, which improves the acceleration performance of custom or non-custom circuits, as follows:

[0060] A CNN acceleration calculation method based on low-precision floating-point number data representation, including the following steps:

[0061] The central control module generates a control signal to arbitrate the floating-point function module and the storage system;

[0062] The floating-point function module receives the input activation value and weight from the storage system according to the control signal, and distributes the...
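The MaEb representation named in this embodiment can be illustrated with a small quantizer. This is a minimal sketch under assumptions that the excerpt does not confirm: MaEb is taken to mean one sign bit, `b` exponent bits with an IEEE-style bias, and `a` mantissa bits with an implicit leading one. Every exponent code is treated as a normal number, magnitudes below the smallest normal value are flushed to zero, and large values saturate. The function name `quantize_maeb` is hypothetical.

```python
import math


def quantize_maeb(x: float, a: int, b: int) -> float:
    """Round x to the nearest value on an assumed MaEb grid.

    Assumed layout: 1 sign bit, b exponent bits (bias 2**(b-1) - 1),
    a mantissa bits with an implicit leading one.
    """
    if x == 0.0:
        return 0.0

    bias = (1 << (b - 1)) - 1
    e_min = -bias                      # smallest unbiased exponent
    e_max = (1 << b) - 1 - bias        # largest unbiased exponent
    max_value = (2.0 - 2.0 ** -a) * 2.0 ** e_max

    sign = -1.0 if x < 0 else 1.0
    magnitude = abs(x)

    # frexp gives magnitude = m * 2**e with 0.5 <= m < 1, so the exponent
    # of the normalized form 1.f * 2**exp is e - 1.
    _, e = math.frexp(magnitude)
    exp = e - 1

    if exp < e_min:
        return 0.0                     # assumption: flush small values to zero
    if exp > e_max:
        return sign * max_value        # assumption: saturate large values

    step = math.ldexp(1.0, exp - a)    # spacing of representable values
    quantized = round(magnitude / step) * step
    return sign * min(quantized, max_value)


if __name__ == "__main__":
    for value in (0.3, -2.5, 100.0, 0.01):
        print(value, "->", quantize_maeb(value, a=4, b=3))
```

Because the function only rounds a trained value onto the grid, it reflects the no-retraining setting described above; how a and b are actually chosen is not covered by this excerpt.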

Embodiment 2

[0117] Based on Embodiment 1, a system includes customized or non-customized circuits; the customized circuits include an ASIC or SOC, and the non-customized circuits include an FPGA. As shown in Figure 1, the customized or non-customized circuit includes a floating-point function module, which receives input activation values and weights from the storage system according to control signals and distributes them to different processing units (PEs), which compute in parallel the convolution of MaEb floating-point numbers quantized through the low-precision floating-point representation, where 0

[0118] A storage system, used to cache input feature maps, weights, and output feature maps;

[0119] A central control module, used to arbitrate between the floating-point function module and the storage system after decoding instructions into control signals;

[0120] The floa...
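The distribution of activations and weights to processing units described in this embodiment can be sketched in software. This is a sequential emulation under assumptions: the excerpt does not say how work is partitioned, so assigning filters to PEs round-robin is an illustrative choice, and the names `pe_dot_product` and `distribute_to_pes` are hypothetical.

```python
from typing import List, Sequence


def pe_dot_product(activations: Sequence[float], weights: Sequence[float]) -> float:
    """Work done by one PE: a dot product of activations and weights."""
    return sum(x * w for x, w in zip(activations, weights))


def distribute_to_pes(
    activations: Sequence[float],
    filters: Sequence[Sequence[float]],
    num_pes: int,
) -> List[float]:
    """Emulate distributing one activation window and several filters to PEs.

    Assumption: PE number p handles filters p, p + num_pes, p + 2*num_pes, ...
    In hardware the PEs would run in parallel; here they run one after another.
    """
    if num_pes < 1:
        raise ValueError("num_pes must be at least 1")

    outputs = [0.0] * len(filters)
    for pe in range(num_pes):
        for index in range(pe, len(filters), num_pes):
            outputs[index] = pe_dot_product(activations, filters[index])
    return outputs


if __name__ == "__main__":
    window = [1.0, 2.0, 3.0]
    filters = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
    print(distribute_to_pes(window, filters, num_pes=2))
```

Each output element corresponds to one filter applied to the same activation window, so the result does not depend on the number of PEs; only the degree of parallelism changes.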


Abstract

The invention discloses a CNN acceleration calculation method and system based on a low-precision floating-point number data representation, and relates to the field of CNN acceleration calculation. The acceleration calculation method comprises the following steps: a floating-point function module receives input activation values and weights from a storage system according to a control signal, and distributes the input activation values and weights to different processing units (PEs) for convolution calculation, completing the CNN acceleration calculation. The convolution calculation comprises the forward calculation of a convolution layer, completed by performing dot product calculation on MaEb floating-point numbers quantized through a low-precision floating-point number representation. By using the low-precision floating-point representation MaEb, the accuracy of the quantized CNN is preserved without retraining. Nm MaEb floating-point multipliers are realized through low-precision floating-point multiplication and a DSP (Digital Signal Processor), so that the acceleration performance of a customized circuit or a non-customized circuit is greatly improved while accuracy is preserved. The customized circuit is an ASIC (Application Specific Integrated Circuit) or SOC (System On Chip), and the non-customized circuit comprises an FPGA (Field Programmable Gate Array).
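The dot product over MaEb floating-point numbers mentioned in the abstract can be sketched with the textbook floating-point multiply: XOR the signs, add the exponents, and multiply the significands as small integers. The excerpt does not describe the internals of the patent's multiplier, so this is only an illustration under assumptions: numbers are held as hypothetical `(sign, exponent, mantissa)` tuples with an unbiased exponent and an `a`-bit mantissa field, products are kept exact, and accumulation uses an ordinary Python float.

```python
import math
from typing import Sequence, Tuple

# Assumed encoding: value = (-1)**sign * (1 + mantissa / 2**a) * 2**exponent
MaEbNumber = Tuple[int, int, int]


def maeb_multiply(x: MaEbNumber, y: MaEbNumber, a: int) -> Tuple[int, int, int]:
    """Multiply two numbers; return (sign, exponent, integer significand).

    The returned significand has 2*a fractional bits.
    """
    sign_x, exp_x, mant_x = x
    sign_y, exp_y, mant_y = y
    hidden_one = 1 << a
    significand = (hidden_one + mant_x) * (hidden_one + mant_y)  # small integer multiply
    return sign_x ^ sign_y, exp_x + exp_y, significand


def maeb_dot(xs: Sequence[MaEbNumber], ys: Sequence[MaEbNumber], a: int) -> float:
    """Dot product of two vectors of assumed-MaEb numbers."""
    accumulator = 0.0
    for x, y in zip(xs, ys):
        sign, exponent, significand = maeb_multiply(x, y, a)
        term = math.ldexp(significand, exponent - 2 * a)
        accumulator += -term if sign else term
    return accumulator


if __name__ == "__main__":
    # With a = 2: (0, 0, 2) is 1.5, (1, 1, 1) is -2.5, (0, 1, 0) is 2.0, (0, 0, 0) is 1.0
    activations = [(0, 0, 2), (0, 1, 0)]
    weights = [(1, 1, 1), (0, 0, 0)]
    print(maeb_dot(activations, weights, a=2))
```

The integer significand multiply is only (a + 1) bits wide on each side, which is the kind of small multiplication that limitation 3 in paragraph [0004] is concerned with fitting into a DSP.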

Description

Technical field

[0001] The invention relates to the field of deep convolutional neural network quantization, in particular to a CNN acceleration calculation method and system based on a low-precision floating-point number data representation.

Background

[0002] In recent years, AI (artificial intelligence) applications have spread into many areas, such as face recognition, game playing, image processing, and simulation. Although processing accuracy has improved, neural networks contain many layers and a large number of parameters, and therefore require a very large calculation cost and storage space. In response, practitioners have proposed neural network compression schemes, which change the network structure or use quantization and approximation methods to reduce network parameters or storage space, lowering network cost and storage requirements without greatly affecting neural network performance. ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/063, G06F7/483, G06F7/57
CPC: G06N3/063, G06F7/483, G06F7/57, G06N3/045
Inventor: 吴晨, 王铭宇, 徐世平
Owner: 深圳市比昂芯科技有限公司