
Adaptive quantization method adapted to a neural network accelerator running on an FPGA

An adaptive quantization and neural network technology, applied in the field of adaptive quantization and neural network accelerators. It solves problems such as data overflow and the inability to guarantee correct calculation results, and achieves easy deployment and implementation, savings in storage space and computing resources, and correct results.

Pending Publication Date: 2022-02-01
XIAN MICROELECTRONICS TECH INST

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to overcome the shortcoming of existing quantization methods, which do not consider the overflow that may occur during integer operations when deployed on an FPGA and therefore cannot guarantee correct calculation results, and to provide an adaptive quantization method adapted to a neural network accelerator running on an FPGA.
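To make the overflow problem concrete (an illustration added to this summary, not part of the patent text), here is a minimal sketch in Python. It assumes a hypothetical accelerator that multiplies int8 operands but keeps partial sums in only a 16-bit register; the accumulator width and the wraparound model are assumptions chosen for illustration.

```python
import numpy as np

ACC_BITS = 16  # assumed accumulator width of the hypothetical accelerator


def wrap_to_acc(value, bits=ACC_BITS):
    """Simulate two's-complement wraparound of a fixed-width accumulator."""
    mask = (1 << bits) - 1
    value &= mask
    if value >= 1 << (bits - 1):  # reinterpret as a signed value
        value -= 1 << bits
    return value


# Worst-case style operands near the top of the int8 range.
x = np.full(16, 127, dtype=np.int64)  # 16 quantized activations
w = np.full(16, 127, dtype=np.int64)  # 16 quantized weights

wide_sum = int(np.dot(x, w))  # 16 * 127 * 127 = 258064

acc = 0
for xi, wi in zip(x, w):
    acc = wrap_to_acc(acc + int(xi) * int(wi))  # narrow accumulator overflows silently

print("wide accumulator  :", wide_sum)  # 258064
print("16-bit accumulator:", acc)       # -4080, a silently wrong result
```

Because the overflow is silent, the dequantized output of the layer is simply wrong, which is the failure mode the adaptive method is designed to prevent.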



Examples


Embodiment

[0030] Taking a neural network model with only one Convolution layer as an example, the present invention will be further described in detail.

[0031] (1) If the invention is not adopted, that is, if the overflow that may occur during integer operations is not considered when deploying on the FPGA, then the calculation process and results are derived as follows:

[0032] The parameters of the convolution layer in this neural network model are pad = 0, stride = 1, the input size is 1×3×3, the filter size is 2×2, and the bias is 1.213093. The process of deploying the neural network model on the neural network accelerator is as follows:

[0033] (a) The quantization coefficients are pre-calculated by the quantization software.

[0034] Take a quantization bit width of 8 as an example. The quantization parameters obtained by the quantization software are th_in = 11.32375, th_w = 8.134685, and th_iout = 153.37865. The quantization coeff...
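The coefficient formula itself is cut off in this extract. As a hedged illustration only, the sketch below assumes the common symmetric mapping used by many quantization tools, scale = (2^(bit_width - 1) - 1) / th, and applies it to the thresholds quoted above; the actual coefficients computed by the patent's quantization software may be defined differently.

```python
# Hedged sketch: one common way quantization coefficients are derived from
# calibration thresholds. The patent's exact formula is truncated in this
# extract, so this mapping is an assumption, not the patented computation.
BIT_WIDTH = 8
QMAX = 2 ** (BIT_WIDTH - 1) - 1  # 127 for 8-bit symmetric quantization

thresholds = {
    "th_in": 11.32375,     # input threshold from the embodiment
    "th_w": 8.134685,      # weight threshold from the embodiment
    "th_iout": 153.37865,  # output threshold from the embodiment
}

scales = {name: QMAX / th for name, th in thresholds.items()}

for name, s in scales.items():
    print(f"scale for {name}: {s:.6f}")
# e.g. scale for th_in = 127 / 11.32375 ≈ 11.22
```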



Abstract

The invention discloses an adaptive quantization method adapted to a neural network accelerator running on an FPGA, and belongs to the field of neural networks. The degree of overflow in the accelerator's calculations is automatically pre-judged according to the actual bit widths used in the calculation process, and the quantization parameters are adaptively adjusted according to that degree of overflow. This avoids data overflow when a neural network algorithm is computed on an FPGA and thereby guarantees the correctness of the neural network model's results. The adaptive quantization method combines the quantization operation with the resource planning of the neural network accelerator hardware, ensures correct results when the accelerator deploys an algorithm, and can effectively reduce the model scale without losing model precision or execution efficiency. The method is easy to deploy and implement under resource-constrained conditions, saves storage space and computing resources, and has important research significance and application value.
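As a rough illustration of the idea stated in the abstract (a sketch under stated assumptions, not the patented procedure), the overflow pre-judgment can be thought of as comparing the worst-case bit width of a layer's accumulated result against the accelerator's actual accumulator width, and narrowing the quantization bit width until the result is guaranteed to fit. The register widths and the adjustment policy below are hypothetical.

```python
import math


def required_acc_bits(act_bits, w_bits, num_accumulations):
    """Worst-case signed bit width of a sum of num_accumulations products of an
    act_bits-wide activation and a w_bits-wide weight (both two's complement)."""
    product_bits = act_bits + w_bits                       # worst-case product width
    growth_bits = math.ceil(math.log2(num_accumulations))  # growth from accumulation
    return product_bits + growth_bits


def choose_bit_width(hw_acc_bits, num_accumulations, start_bits=8, min_bits=2):
    """Hypothetical adaptive adjustment: shrink the quantization bit width until
    the worst-case accumulation fits the hardware accumulator."""
    bits = start_bits
    while bits > min_bits and required_acc_bits(bits, bits, num_accumulations) > hw_acc_bits:
        bits -= 1
    return bits


# Example: a 2x2 filter over one input channel -> 4 multiply-accumulates per output.
print(required_acc_bits(8, 8, 4))   # 8 + 8 + 2 = 18 bits needed in the worst case
print(choose_bit_width(18, 4))      # 8  (fits an assumed 18-bit accumulator)
print(choose_bit_width(16, 4))      # 7  (8-bit would overflow a 16-bit accumulator)
```

A real implementation would also account for the bias addition and for the accelerator's actual datapath, but the same fit-or-narrow reasoning applies.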

Description

Technical field

[0001] The invention belongs to the field of neural networks, and in particular relates to an adaptive quantization method adapted to a neural network accelerator running on an FPGA.

Background technique

[0002] In order to achieve high-speed, low-power calculation, neural network accelerators generally support low numerical precision, such as 8-bit or 6-bit fixed-point arithmetic, while the original numerical precision of neural network models is generally 32-bit floating point. Therefore, when a neural network accelerator deploys a neural network algorithm, the neural network model must be automatically compressed into an 8-bit or 6-bit integer network through a quantization operation.

[0003] The foreign giant Nvidia dominates the neural network accelerator market on the strength of its complete GPU+CUDA ecosystem, and maps floating-point data to integer data. However, its products are expensive, are not independently controllable, and GPU computing per...
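For readers unfamiliar with the quantization operation mentioned above, the following is a generic example of mapping 32-bit floating-point values to 8-bit integers with a symmetric scale (a textbook-style scheme used here only to illustrate the background; it is not claimed to be the patent's method).

```python
import numpy as np


def quantize_symmetric(x, th, bit_width=8):
    """Map float values in [-th, th] to signed integers of the given bit width."""
    qmax = 2 ** (bit_width - 1) - 1
    scale = qmax / th
    q = np.clip(np.round(x * scale), -qmax, qmax).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Approximate reconstruction of the original floating-point values."""
    return q.astype(np.float32) / scale


x = np.array([0.5, -3.2, 7.9, 11.0], dtype=np.float32)
q, scale = quantize_symmetric(x, th=11.32375)
print(q)                     # int8 codes, e.g. [  6 -36  89 123]
print(dequantize(q, scale))  # values close to the original x
```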

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/02
CPC: G06N3/02
Inventors: 魏璐, 马钟, 王月娇, 杨超杰
Owner: XIAN MICROELECTRONICS TECH INST