Efficient re-configurable compute core for convolutional neural network

An efficient convolution computing technology, applied to biological neural network models, physical realization, and related fields.

Active Publication Date: 2018-06-01
南京风兴科技有限公司

AI Technical Summary

Problems solved by technology

Most convolutional networks use 3*3 or 5*5 convolution kernels; a small number use larger 7*7 or 11*11 kernels, and other sizes also occur. ... not used effectively

Examples

Embodiment Construction

[0043] The following describes how the RCC structure is configured and how it operates in each mode. Its input and output interface names correspond one-to-one with those in Figure 2.

[0044] To enable 3*3 mode, set the control signals {cs_3, cs_7, cs_11} to {1, 0, 0}. Each of the two fast convolution modules then performs three independent 3*3 convolutions. Table 1 shows the input and output data streams of the three independent 3*3 convolutions performed by the first fast convolution module. The data patterns of the three convolutions performed by the second fast convolution module are analogous: simply replace the subscript a in Table 1 with b.

[0045]

[0046] Table 1. Input and output data flow of 3*3 mode

[0047] To enable 5*5 mode, set control signals {cs_3, cs_7, cs_11} to {0, 0, 0}. The two fast convolution modules realize two 6*6 convolution calculations in total, and realize 5*5 convolutio...
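The mode selection described in paragraphs [0044] and [0047] can be sketched as a small lookup table. This is only an illustrative sketch: the names `RCC_MODE_SIGNALS` and `control_signals` are hypothetical and do not come from the patent, and only the two modes whose signal values appear in this excerpt are listed. The values for other modes are not given here, so they are left out rather than guessed.

```python
from typing import Dict, Tuple

# Control-signal triples (cs_3, cs_7, cs_11) as stated in the text.
# Hypothetical table name; only the modes given in this excerpt are included.
RCC_MODE_SIGNALS: Dict[str, Tuple[int, int, int]] = {
    "3x3": (1, 0, 0),  # paragraph [0044]
    "5x5": (0, 0, 0),  # paragraph [0047]
}


def control_signals(mode: str) -> Dict[str, int]:
    """Return the named control signals for a mode listed in the excerpt."""
    if mode not in RCC_MODE_SIGNALS:
        # Signal values for other modes are not stated in this excerpt.
        raise ValueError(f"no control signals known for mode {mode!r}")
    cs_3, cs_7, cs_11 = RCC_MODE_SIGNALS[mode]
    return {"cs_3": cs_3, "cs_7": cs_7, "cs_11": cs_11}


if __name__ == "__main__":
    print(control_signals("3x3"))
    print(control_signals("5x5"))
```

Raising an error for unlisted modes, instead of returning a default, keeps the sketch from silently implying signal values that the text does not provide.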

Abstract

The present invention discloses an efficient re-configurable compute core (RCC) for convolutional neural networks. Through configuration, the structure efficiently performs convolution with the four mainstream kernel sizes, as well as all kernel sizes under 12*12, and substantially reduces the complexity of convolution. A hardware structure based on the fast FIR algorithm (FFIR) is introduced: a 2-parallel FFIR is cascaded with a 3-parallel FFIR to form a 6-parallel FFIR (6P-FFIR), which is then optimized with a compressor. An efficient RCC is designed on top of the 6P-FFIR structure. Compared with a traditional FIR filter, the RCC saves 33%-47% of the multiplications when computing convolutions of the four mainstream sizes. The design saves considerable hardware area and power, making it well suited to power-constrained scenarios such as the Internet of Things and embedded chips; it also applies to convolutions of many sizes and improves the effective throughput of the system.
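To illustrate the idea behind a fast FIR structure, the sketch below implements the textbook 2-parallel fast FIR algorithm in NumPy and checks it against direct convolution. It assumes that the patent's 2-parallel FFIR block follows this standard form, which this excerpt does not confirm. The function name `ffa2_parallel` is hypothetical, and the sketch covers neither the 3-parallel stage, the cascaded 6P-FFIR, nor the compressor optimization.

```python
import numpy as np


def ffa2_parallel(x, h):
    """Textbook 2-parallel fast FIR: 3 half-length subfilters instead of 4."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    out_len = len(x) + len(h) - 1  # length of the direct convolution

    # Pad to even lengths so the even/odd polyphase split is uniform.
    if len(x) % 2:
        x = np.append(x, 0.0)
    if len(h) % 2:
        h = np.append(h, 0.0)

    x0, x1 = x[0::2], x[1::2]  # even / odd input phases
    h0, h1 = h[0::2], h[1::2]  # even / odd coefficient phases

    a = np.convolve(h0, x0)            # H0 * X0
    b = np.convolve(h1, x1)            # H1 * X1
    c = np.convolve(h0 + h1, x0 + x1)  # (H0 + H1) * (X0 + X1)

    n = len(a)
    # Even output phase: Y0 = H0*X0 + (one-sample delay at half rate) H1*X1
    y0 = np.zeros(n + 1)
    y0[:n] += a
    y0[1:] += b
    # Odd output phase: Y1 = H0*X1 + H1*X0, recovered from the three products
    y1 = c - a - b

    # Interleave the two phases back into a full-rate output.
    y = np.zeros(2 * (n + 1))
    y[0::2] = y0
    y[1:2 * n:2] = y1
    return y[:out_len]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7]
    h = [1, 0, -1]
    print(np.allclose(ffa2_parallel(x, h), np.convolve(x, h)))
```

A direct 2-parallel filter needs four half-length subfilters (H0X0, H0X1, H1X0, H1X1). The form above needs three, at the cost of a few extra additions. The 33%-47% saving quoted in the abstract refers to the patent's full 6P-FFIR-based RCC, not to this 2-parallel sketch.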

Description

Technical field

[0001] The invention relates to the fields of integrated circuits and machine learning, and in particular to the hardware structure of a general-purpose convolutional neural network accelerator that efficiently implements convolutions of the four sizes 3*3, 5*5, 7*7 and 11*11, and can also implement convolutions of 12*12 and all smaller sizes.

Background technique

[0002] The convolutional neural network (CNN) is currently one of the most studied and widely used machine learning algorithms. Convolution is the part of a CNN that consumes the most computing resources. Most convolutional neural network models now run on cloud platforms built around CPUs or GPUs. As artificial intelligence technology advances and expands, the demand for convolutional neural networks in embedded systems and real-time systems, which have strict requirements on hardware resources, is also increasing,...

Application Information

IPC(8): G06N3/063
CPC: G06N3/063
Inventor: 王中风, 王昊楠, 林军
Owner: 南京风兴科技有限公司