CNN convolution kernel hardware design method based on cascade form

A design method and core hardware technology, applied in the field of deep learning, that addresses growing network complexity, rising computational demand, and the difficulty of executing CNN operations quickly. Its effects are reduced complexity, retained flexible configurability, and increased speed.

Inactive Publication Date: 2017-06-13
SOUTHEAST UNIV

AI Technical Summary

Problems solved by technology

However, doing so also increases the overall complexity of the network and greatly increases the computational demand.
In CNN computation, the convolution operation accounts for more than 90% of the total computation. Because of its structural characteristics, the traditional von Neumann computer architecture has difficulty performing CNN operations quickly.


Examples

Embodiment Construction

[0027] The technical solution of the present invention will be further introduced below in conjunction with the accompanying drawings and specific embodiments.

[0028] The calculation of convolutional neural network convolution can be expressed by the following formula:

[0029]
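The formula for paragraph [0029] is not reproduced in this text. As a hedged illustration only, the sketch below computes one convolution layer in the form commonly used for CNNs (a sliding-window sum of products over all input channels plus an offset value, stride 1, no padding). The function name `conv_layer` and the array layout are assumptions made for this example and are not taken from the patent.

```python
def conv_layer(x, w, b):
    """Direct CNN convolution, stride 1, no padding (illustrative layout).

    x[c][i][j]    : input feature maps, c = input channel
    w[o][c][u][v] : K x K kernels, o = output channel
    b[o]          : offset (bias) value per output channel
    """
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    h_out, w_out = height - k + 1, width - k + 1
    y = []
    for o in range(len(w)):
        plane = []
        for i in range(h_out):
            row = []
            for j in range(w_out):
                acc = b[o]
                # Sum of products over every input channel and kernel tap.
                for c in range(c_in):
                    for u in range(k):
                        for v in range(k):
                            acc += w[o][c][u][v] * x[c][i + u][j + v]
                row.append(acc)
            plane.append(row)
        y.append(plane)
    return y


x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # one 3 x 3 input channel
w = [[[[1, 0], [0, 1]]]]                  # one 2 x 2 kernel
print(conv_layer(x, w, [1]))
```

Every output value here costs K×K×C_in multiplications, which is the cost that a fast convolution algorithm aims to reduce.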

[0030] Therefore, for L×L linear convolution, the fast convolution algorithm can be expressed in the form of matrix multiplication as:

[0031] $S_{2L-1} = Q_L h_L P_L x_L$

[0032] $x_L$ and $S_{2L-1}$ are the input and output vectors, respectively. To improve throughput, the parallel filter can be obtained by transposing the linear convolution formula and expressed as:

[0033]
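To make the matrix form in [0031] concrete, the sketch below works through the case L = 2. The matrices `P2` and `Q2` are the textbook 2×2 fast convolution matrices and are an assumption here, since the patent text above does not list them. The middle factor is taken to be a diagonal matrix built from the pre-added filter coefficients, which is also an interpretation rather than something the source states. The transposed parallel-filter form of [0033] is not attempted.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


# Assumed textbook matrices for a 2 x 2 fast linear convolution.
P2 = [[1, 0],
      [1, 1],
      [0, 1]]            # pre-addition matrix
Q2 = [[1, 0, 0],
      [-1, 1, -1],
      [0, 0, 1]]         # post-addition matrix


def fast_conv_2x2(h, x):
    """Compute s = Q2 * diag(P2 h) * P2 * x with 3 multiplications."""
    hp = matvec(P2, h)                        # can be precomputed per kernel
    xp = matvec(P2, x)                        # pre-additions on the input
    prod = [a * b for a, b in zip(hp, xp)]    # the only multiplications
    return matvec(Q2, prod)                   # post-additions


def direct_conv(h, x):
    """Reference linear convolution (len(h) * len(x) multiplications)."""
    s = [0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            s[i + j] += hi * xj
    return s


print(fast_conv_2x2([1, 2], [3, 4]))   # matches direct_conv([1, 2], [3, 4])
```

The fast form uses three multiplications where the direct form uses four, at the price of a few extra additions.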

[0034] Considering that each matrix can be obtained from the basic convolution matrix, the basic convolution matrix is generated first and then transformed to obtain each matrix. Based on this idea, a cascade-based CNN convolution kernel hardware design method is proposed, including ...
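The cascading idea in [0034] can be sketched in software under one assumption: the larger matrices are built from a 2×2 basic module by a Kronecker product, as in the standard iterated fast convolution technique. The patent text is truncated at this point, so the construction below is an illustration of that general technique, not a reproduction of the patented architecture. All names (`P2`, `Q2`, `kron`, `fast_conv_4x4`) are hypothetical.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[ai * bj for ai in ra for bj in rb] for ra in a for rb in b]


# Assumed 2 x 2 basic module (textbook fast convolution matrices).
P2 = [[1, 0], [1, 1], [0, 1]]
Q2 = [[1, 0, 0], [-1, 1, -1], [0, 0, 1]]

# Cascading two basic modules gives the 4 x 4 pre/post-addition matrices.
P4 = kron(P2, P2)   # 9 x 4
Q4 = kron(Q2, Q2)   # 9 x 9


def fast_conv_4x4(h, x):
    """4 x 4 linear convolution using 9 multiplications instead of 16."""
    hp = matvec(P4, h)
    xp = matvec(P4, x)
    prod = [a * b for a, b in zip(hp, xp)]
    t = matvec(Q4, prod)          # 3 x 3 block result, index 3*b + a
    s = [0] * 7
    # Final adder stage: overlap-add the blocks into 2L-1 = 7 outputs.
    for b in range(3):
        for a in range(3):
            s[2 * b + a] += t[3 * b + a]
    return s


def direct_conv(h, x):
    """Reference linear convolution."""
    s = [0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            s[i + j] += hi * xj
    return s


print(fast_conv_4x4([1, 2, 3, 4], [5, 6, 7, 8]))
```

In this sketch the same 2×2 module is reused at both cascade levels, and only the final adder stage depends on the kernel size.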

Abstract

The invention discloses a CNN convolution kernel hardware design method based on a cascade form. The method comprises three steps. In the first step, a basic convolution matrix is defined and implemented in hardware to obtain a basic convolution module. In the second step, linear convolution is expressed as a matrix multiplication using a fast convolution algorithm, and the basic convolution modules are cascaded according to this expression to form a cascaded convolution module. In the third step, an addition matrix implemented with adders is added to the cascaded convolution module to perform parallel processing, the addition of the offset value, and the two-dimensional convolution summation, which yields the final fast convolution kernel. Compared with a traditional two-dimensional convolution kernel, the method lowers the hardware requirement while increasing system speed. It also avoids the drawback that the hardware architecture must be redesigned whenever the size of the fast convolution kernel changes.

Description

Technical field

[0001] The invention relates to the technical field of deep learning, and in particular to a cascade-based CNN convolution kernel hardware design method.

Background art

[0002] Over the past few years, deep learning has excelled at solving many problems such as visual recognition, speech recognition, and natural language processing. Among the different types of neural networks, convolutional neural networks are the most intensively studied. In the early days, the lack of training data and computing power made it difficult to train high-performance convolutional neural networks without overfitting.

[0003] Convolutional neural network research has recently produced many state-of-the-art results. Since 2006, many methods have been designed to overcome the difficulty of training deep convolutional neural networks (CNNs). The best known is the classic CNN structure AlexNet, proposed by Krizhevsky et al., which made a major breakt...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/063, G06N3/04, G06N3/08
CPC: G06N3/063, G06N3/08, G06N3/045
Inventor: 张川, 徐炜鸿, 尤肖虎
Owner SOUTHEAST UNIV