Hardware acceleration method for convolutional neural networks and AXI bus IP core thereof

A hardware acceleration technology for convolutional neural networks, applied in the field of convolutional neural network hardware acceleration. It addresses the problems that an ASIC cannot be modified and requires large amounts of capital and human resources.

Active Publication Date: 2015-09-16
NAT UNIV OF DEFENSE TECH
3 Cites, 250 Cited by

AI Technical Summary

Problems solved by technology

ASIC tape-out requires a large amount of capital and human resources, and the resulting chip cannot be modified.

Examples

Embodiment Construction

[0069] As shown in Figure 1, the convolutional neural network hardware acceleration method of this embodiment comprises the following steps:

[0070] 1) Rearrange the input feature maps of the convolution operation in advance to form a matrix A, and arrange the convolution kernels corresponding to the output feature maps of the convolution operation to form a matrix B, so that the convolution operation of a convolutional layer of the convolutional neural network is converted into the multiplication of a matrix A with m rows and K columns by a matrix B with K rows and n columns;

[0071] 2) Divide the result matrix C of the matrix multiplication into m rows and n columns of matrix sub-blocks;

[0072] 3) Start the matrix multiplier connected to the main processor to compute all of the matrix sub-blocks; while computing the matrix sub-blocks, the matrix multiplier generates data requests in the form of matrix coordinates (Bx, By) in a data-driven manner, the matrix coordinates (Bx, By) ar...
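Step 1) can be illustrated with a minimal sketch. It assumes one common arrangement (often called im2col) with stride 1 and no padding, and computes the cross-correlation form of convolution used in CNNs; the exact data layout used by the patent is not given in this excerpt, so the function names and the row/column ordering below are hypothetical. Each row of A holds the input window for one output pixel and each column of B holds one flattened kernel, so m is the number of output pixels, K is channels x kernel height x kernel width, and n is the number of output feature maps.

```python
# Sketch only: layout and names are assumptions, not the patent's definition.

def im2col(fmap, kh, kw):
    """Unroll a [C][H][W] input into matrix A (m x K), one row per output pixel."""
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[fmap[c][y + dy][x + dx]
             for c in range(channels) for dy in range(kh) for dx in range(kw)]
            for y in range(height - kh + 1)
            for x in range(width - kw + 1)]


def kernels_to_matrix(kernels):
    """Flatten [N][C][kh][kw] kernels into matrix B (K x n), one column per kernel."""
    flat = [[w for channel in kernel for row in channel for w in row]
            for kernel in kernels]
    return [list(col) for col in zip(*flat)]


def matmul(a, b):
    """Plain matrix product C = A x B."""
    return [[sum(x * b[t][j] for t, x in enumerate(row))
             for j in range(len(b[0]))]
            for row in a]


def direct_conv(fmap, kernels):
    """Reference sliding-window convolution, returned as [N][out_h][out_w]."""
    kh, kw = len(kernels[0][0]), len(kernels[0][0][0])
    out_h, out_w = len(fmap[0]) - kh + 1, len(fmap[0][0]) - kw + 1
    return [[[sum(fmap[c][y + dy][x + dx] * kernel[c][dy][dx]
                  for c in range(len(fmap))
                  for dy in range(kh) for dx in range(kw))
              for x in range(out_w)]
             for y in range(out_h)]
            for kernel in kernels]


# One 3x3 input channel and two 2x2 kernels: m = 4, K = 4, n = 2.
fmap = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
kernels = [[[[1, 0], [0, 1]]], [[[0, 1], [1, 0]]]]
a = im2col(fmap, 2, 2)
b = kernels_to_matrix(kernels)
c = matmul(a, b)
print(c)  # [[6, 6], [8, 8], [12, 12], [14, 14]]
```

Column j of the product holds output feature map j in row-major pixel order, which is why a single matrix multiplier can serve convolutional layers of different shapes.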

Abstract

The invention discloses a hardware acceleration method for convolutional neural networks and an AXI bus IP core thereof. The method comprises: a first step of converting the convolution operation of a convolutional layer into the multiplication of a matrix A with m rows and K columns by a matrix B with K rows and n columns; a second step of dividing the result matrix into m rows and n columns of matrix sub-blocks; a third step of starting a matrix multiplier to prefetch the operands of the matrix sub-blocks; and a fourth step in which the matrix multiplier computes the matrix sub-blocks and writes the results back to main memory. The IP core comprises an AXI bus interface module, a prefetching unit, a flow mapper and a matrix multiplier. The matrix multiplier comprises a chained DMA and a processing unit array; the processing unit array consists of a plurality of processing units arranged in a chain structure, and the processing unit at the head of the chain is connected to the chained DMA. The method supports various convolutional neural network structures and offers high computational efficiency and performance, low demands on on-chip storage resources and off-chip memory bandwidth, low communication overhead, easy upgrading and improvement of unit components, and good generality.
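The second to fourth steps of the abstract (partition the result, prefetch operands, compute each sub-block, write it back) can be sketched in software as follows. This is only an illustration under stated assumptions: the tile sizes `tile_rows` and `tile_cols`, the function name, and the use of plain Python lists in place of main memory and on-chip buffers are all hypothetical, and the sketch does not model the chained DMA or the processing unit chain.

```python
# Sketch only: a software analogue of sub-block computation, not the IP core.

def blocked_matmul(a, b, tile_rows, tile_cols):
    """Compute C = A x B one sub-block of C at a time."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]  # stands in for the result in main memory
    for r0 in range(0, m, tile_rows):
        for c0 in range(0, n, tile_cols):
            # "Prefetch": copy only the operands this sub-block needs.
            a_panel = a[r0:r0 + tile_rows]                    # rows of A
            b_panel = [row[c0:c0 + tile_cols] for row in b]   # columns of B
            # Compute the sub-block from the local copies.
            block = [[sum(a_row[t] * b_panel[t][j] for t in range(k))
                      for j in range(len(b_panel[0]))]
                     for a_row in a_panel]
            # Write the finished sub-block back to the result matrix.
            for i, block_row in enumerate(block):
                c[r0 + i][c0:c0 + len(block_row)] = block_row
    return c


print(blocked_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], 1, 1))
# [[19, 22], [43, 50]]
```

Because each sub-block needs only a panel of A rows and a panel of B columns, the working set per step is bounded by the tile size rather than by the full matrices, which is consistent with the abstract's claim of modest on-chip storage requirements.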

Description

Technical field

[0001] The invention relates to hardware acceleration technology for convolutional neural networks, and in particular to a hardware acceleration method for convolutional neural networks and an AXI bus IP core thereof.

Background

[0002] The core challenge for the next generation of smart-device processors is to perceive and understand the human world, so as to provide an ecosystem that enhances the user experience and adapts to user preferences, and to interact with users in a human-like way. The Convolutional Neural Network (CNN) is one of the most advanced perception models currently available. It parses raw input data into symbols layer by layer and extracts complex multi-layer combined features, and it has achieved great success and wide application in machine vision and auditory systems. In 2013, MIT Technology Review magazine ranked deep learning represented by convolutional neural network as the top t...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F15/16, G06F15/17, G06F9/50, G06F13/42
Inventor: 文梅, 乔寓然, 杨乾明, 沈俊忠, 肖涛, 王自伟, 张春元, 苏华友, 陈照云
Owner: NAT UNIV OF DEFENSE TECH