A convolutional neural network accelerator based on calculation optimization of an FPGA

A convolutional neural network accelerator technology, applied in the field of convolutional neural network accelerator hardware structures, that addresses the large amount of redundant calculation in convolution and achieves high computing performance, fewer data reads, and better real-time performance.

Pending Publication Date: 2019-04-09

SOUTHEAST UNIV +2

6 Cites, 43 Cited by

AI Technical Summary

Problems solved by technology

[0006] Based on the foregoing analysis, convolution calculations in the prior art involve excessive redundant computation, and the present invention arises from this problem.

Embodiment Construction

[0029] The technical solutions and beneficial effects of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0030] Figure 1 shows the hardware structure of the convolutional neural network accelerator designed in the present invention. As an example, take a PE array of size 16*16, a convolution kernel of size 3*3, and a convolution stride of 1. The accelerator works as follows:

[0031] The PC caches the partitioned data in the external DDR memory through the PCI-E interface. The data cache area reads the feature map data through the AXI4 bus interface and caches it by row in three feature map sub-buffers. The input index values are cached in the feature map sub-buffers in the same way. The weight data read through the AXI4 bus interface is cached sequentially in 16 convolution kernel sub-buffers, and the weight index values are cached in the convolution kernel sub-buffers in the sa...
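
The row-wise buffering described in [0031] can be pictured with a small software model. This is a minimal sketch, not the patent's implementation: the round-robin placement of rows and kernels, and the function names `cache_feature_map_rows`, `cache_kernels`, and `window_rows`, are assumptions made for illustration. Only the sub-buffer counts (three and 16) come from the text.

```python
# Illustrative model only. Round-robin placement and all names below are
# assumptions; the source only states that rows go to three feature map
# sub-buffers and weights go to 16 convolution kernel sub-buffers.

NUM_FMAP_SUBBUFFERS = 3     # from the text: three feature map sub-buffers
NUM_KERNEL_SUBBUFFERS = 16  # from the text: 16 convolution kernel sub-buffers


def cache_feature_map_rows(feature_map):
    """Distribute feature map rows over the sub-buffers, one row at a time."""
    sub_buffers = [[] for _ in range(NUM_FMAP_SUBBUFFERS)]
    for row_index, row in enumerate(feature_map):
        sub_buffers[row_index % NUM_FMAP_SUBBUFFERS].append(list(row))
    return sub_buffers


def cache_kernels(kernels):
    """Distribute convolution kernels over the kernel sub-buffers in order."""
    sub_buffers = [[] for _ in range(NUM_KERNEL_SUBBUFFERS)]
    for kernel_index, kernel in enumerate(kernels):
        sub_buffers[kernel_index % NUM_KERNEL_SUBBUFFERS].append(kernel)
    return sub_buffers


def window_rows(sub_buffers, top_row):
    """Fetch the three consecutive rows a 3*3 kernel needs, starting at top_row.

    Row r lives in sub-buffer r % 3 at position r // 3, so the three rows
    always come from three different sub-buffers.
    """
    rows = []
    for r in range(top_row, top_row + 3):
        rows.append(sub_buffers[r % NUM_FMAP_SUBBUFFERS][r // NUM_FMAP_SUBBUFFERS])
    return rows
```

Under this assumed placement, any three consecutive rows sit in three different sub-buffers, which is one plausible reason to pair three sub-buffers with a 3*3 kernel: a full window can be read in parallel.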

Abstract

The invention discloses a computation-optimized convolutional neural network accelerator based on an FPGA. The accelerator comprises an AXI4 bus interface, a data cache area, a pre-fetched data area, a result cache area, a state controller, and a PE array. The data cache area caches the feature map data, convolution kernel data, and index values read from the external DDR memory through the AXI4 bus interface. The pre-fetched data area pre-fetches, from the feature map sub-cache areas, the feature map data that must be input to the PE array in parallel. The result cache area caches the calculation results of each row of PEs. The state controller controls the working state of the accelerator and handles transitions between working states. The PE array reads data from the pre-fetched data area and the convolution kernel sub-cache areas to perform the convolution operation. The accelerator exploits parameter sparsity, repeated weight data, and the properties of the ReLU activation function to end redundant calculations early, which reduces the amount of computation. It also reduces energy consumption by reducing the number of memory accesses.
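
The three sources of redundancy named in the abstract can be illustrated in software. The sketch below is a hedged illustration, not the method claimed in the patent. It assumes the input activations are non-negative (for example, outputs of an earlier ReLU layer) and that non-negative weights are processed before negative ones. The function names `relu_dot_early_exit` and `dot_with_weight_reuse` are hypothetical.

```python
def relu_dot_early_exit(activations, weights):
    """Return (relu(dot(activations, weights)), multiplications performed).

    Assumption: every activation is >= 0. The weight ordering and the
    skip-on-zero test are illustrative, not taken from the patent.
    """
    if any(a < 0 for a in activations):
        raise ValueError("sketch assumes non-negative activations")

    # Sparsity: drop pairs where either operand is zero; they add nothing.
    pairs = [(a, w) for a, w in zip(activations, weights) if a != 0 and w != 0]
    # Stable sort: positive weights first, negative weights last.
    pairs.sort(key=lambda p: p[1] < 0)

    total = 0
    multiplications = 0
    for a, w in pairs:
        # Once only negative weights remain, the sum can only decrease.
        # If it is already <= 0, ReLU will output 0, so stop early.
        if w < 0 and total <= 0:
            return 0, multiplications
        total += a * w
        multiplications += 1
    return max(total, 0), multiplications


def dot_with_weight_reuse(activations, weights):
    """Return (dot product, multiplications performed) using repeated weights.

    Activations that share a weight value are summed first, so each
    distinct non-zero weight is multiplied only once (an assumption about
    how repeated weights could be exploited).
    """
    sums = {}
    for a, w in zip(activations, weights):
        if w != 0:
            sums[w] = sums.get(w, 0) + a
    return sum(w * s for w, s in sums.items()), len(sums)
```

For activations `[1, 0, 2, 3]` and weights `[2, 5, -4, -1]`, `relu_dot_early_exit` returns `(0, 2)`: one pair is skipped because the activation is zero, and the last multiplication is skipped because the partial sum is already negative. For activations `[1, 2, 3, 4]` and weights `[2, 2, 0, -1]`, `dot_with_weight_reuse` returns `(2, 2)`, using two multiplications instead of four.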

Description

Technical field

[0001] The invention belongs to the field of electronic information and deep learning, and in particular relates to the hardware structure of a computation-optimized convolutional neural network accelerator based on an FPGA (Field Programmable Gate Array).

Background art

[0002] In recent years, the use of deep neural networks has grown rapidly and has had a significant impact on the world's economic and social activities. Deep convolutional neural network technology has received widespread attention in many machine learning fields, including speech recognition, natural language processing, and intelligent image processing. In image recognition especially, deep convolutional neural networks have achieved remarkable results. In these domains, deep convolutional neural networks are able to achieve superhuman accuracy. The excellence of deep convolutional neural networks stems from their ability to extract high-level features from raw data after performing stati...

Application Information

IPC(8): G06N3/04, G06N3/063

CPC: G06N3/063, G06N3/045, Y02D10/00

Inventor: 陆生礼, 庞伟, 舒程昊, 范雪梅, 吴成路, 邹涛

Owner SOUTHEAST UNIV
