Convolutional neural network hardware accelerator based on parallel multiplexing, and parallel multiplexing method

A convolutional neural network hardware accelerator technology, applied in the field of artificial intelligence hardware design. It addresses problems such as limited computing parallelism, high hardware complexity, and wasted resources, so as to improve network multiplexing, improve flexibility, and shorten computation time.

Pending Publication Date: 2022-07-22
HEFEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0010] To overcome the shortcomings of the above-mentioned prior art, the present invention proposes a hardware accelerator for a convolutional neural network based on parallel multiplexing, and a parallel multiplexing method. The multiplexing method simplifies large-scale convolution calculations in neural networks, thereby realizing convolutional neural network computation with high computing parallelism, high data reuse, and low hardware complexity, improving hardware flexibility and reducing wasted resources.




Embodiment Construction

[0091] In this embodiment, the convolutional neural network includes K convolutional layers, K+1 activation layers, K pooling layers, and 2 fully connected layers. The convolutional neural network is trained on the MNIST handwritten digit dataset to obtain the weight parameters.

[0092] The convolutional neural network used in this embodiment is a handwritten digit recognition network, as shown in Figure 1. Its structure includes 2 convolutional layers, 3 activation layers, 2 pooling layers, and 2 fully connected layers, using 3×3 convolution kernels. Except for the activation layers, which immediately follow the convolutional layers and do not affect the feature-map size, the other six layers are arranged as follows: the first layer is a convolutional layer with input feature size 28×28×1, weight matrix size 3×3×1×4, bias vector length 4, and output feature size ...
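The first-layer arithmetic above can be traced with a minimal NumPy sketch (a software model only, not the patent's hardware design; stride 1 and no padding are assumptions, since the paragraph does not state them):

```python
import numpy as np

def conv2d(x, w, b):
    """Valid (no-padding) stride-1 convolution: x (H, W, Cin), w (K, K, Cin, Cout), b (Cout,)."""
    H, W, _ = x.shape
    K, _, _, Cout = w.shape
    Ho, Wo = H - K + 1, W - K + 1
    y = np.zeros((Ho, Wo, Cout))
    for i in range(Ho):
        for j in range(Wo):
            patch = x[i:i + K, j:j + K, :]
            # sum over the 3x3xCin window for each output channel, then add bias
            y[i, j, :] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2])) + b
    return y

def relu(x):
    return np.maximum(x, 0)

def maxpool2x2(x):
    H, W, C = x.shape
    return x[:H // 2 * 2, :W // 2 * 2, :].reshape(H // 2, 2, W // 2, 2, C).max(axis=(1, 3))

# First layer of the described network: 28x28x1 input, 3x3x1x4 weights, bias length 4
x = np.random.rand(28, 28, 1)
w1 = np.random.rand(3, 3, 1, 4) * 0.1
b1 = np.zeros(4)
y1 = maxpool2x2(relu(conv2d(x, w1, b1)))
print(y1.shape)  # (13, 13, 4) under the stride-1, no-padding assumption
```

Under these assumptions the convolution produces a 26×26×4 feature map, which 2×2 pooling reduces to 13×13×4.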



Abstract

The invention discloses a hardware accelerator for a convolutional neural network based on parallel multiplexing, and a parallel multiplexing method. The hardware accelerator comprises a parameter storage module, a REG-FIFO module, a counting control module, an input-multiplexing convolution operation module, an activation module, and a pooling layer module. The parameter storage module pre-stores image parameters and trained weight parameters; the REG-FIFO module generates an input matrix matched to the convolution kernel and reads matrix data; the counting control module counts clock cycles and, according to that count, controls the input and output of the REG-FIFO module. The input-multiplexing convolution operation module performs the convolution operations of the convolutional and fully connected layers; the activation module performs the activation of the convolutional and fully connected layer outputs; and the pooling layer module performs the pooling of the activated convolutional layer output. The invention aims to realize convolutional neural network computation with high operational parallelism, high data multiplexing, and low hardware complexity.
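The REG-FIFO idea described above — turning a row-major pixel stream into k×k input matrices matched to the convolution kernel — can be modelled in software roughly as follows. This is a hypothetical sketch using row buffers; the function name and the row-at-a-time emission order are assumptions, not the patent's cycle-accurate design:

```python
from collections import deque

def window_stream(pixels, width, k=3):
    """Software model of a line-buffer window generator: buffer the k most
    recent image rows and slide a k x k window across them, emitting one
    window per valid position, as a REG-FIFO structure would over time."""
    rows = deque(maxlen=k)   # the k most recently completed rows
    current = []
    out = []
    for p in pixels:
        current.append(p)
        if len(current) == width:        # a full row has streamed in
            rows.append(current)
            current = []
            if len(rows) == k:           # enough rows buffered for k x k windows
                for j in range(width - k + 1):
                    out.append([row[j:j + k] for row in rows])
    return out

# 4x4 image streamed as pixels 0..15 yields (4-2) * (4-2) = 4 windows
windows = window_stream(list(range(16)), width=4, k=3)
print(len(windows))  # 4
```

A real hardware REG-FIFO would emit one window per clock once primed; the model above batches them per row, but the windows it produces are the same ones the convolution module would consume.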

Description

technical field
[0001] The invention belongs to the field of artificial intelligence hardware design, and in particular relates to a method for implementing a parallel-multiplexing convolutional neural network computing accelerator.
Background technique
[0002] Convolutional Neural Network (CNN) is a popular direction in the field of Deep Learning (DL) in recent years. However, with the growing data volume of CNN models and ever-higher computational and accuracy requirements, accelerating convolutional neural networks has become a challenge.
[0003] In general, two characteristics are exploited to speed up the CNN algorithm: (1) Sparse connection: the connections between internal neurons in the calculation are changed to a non-fully-connected form; from the perspective of CNN visualization, a given response in the output feature map is no longer connected to ...
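The sparse-connection characteristic (together with weight sharing) can be made concrete with a quick parameter count for the first layer of the embodiment's network; this is illustrative arithmetic only, with stride 1 and no padding assumed:

```python
# Parameter count for a 28x28x1 -> 26x26x4 mapping (3x3 kernel, stride 1, no padding)
H = W = 28
C_in, C_out, K = 1, 4, 3
H_out = W_out = H - K + 1                                   # 26 under the no-padding assumption

dense_weights = (H * W * C_in) * (H_out * W_out * C_out)    # fully connected: every input feeds every output
conv_weights = K * K * C_in * C_out                         # sparse + shared: one 3x3 kernel per channel pair
print(dense_weights, conv_weights)  # 2119936 36
```

Each output response touches only a 3×3 neighbourhood of the input (sparse connection), and all positions reuse the same kernel (weight sharing), which is what makes the high data reuse claimed by the invention possible in hardware.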


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/063; G06N3/04; G06N3/08
CPC: G06N3/063; G06N3/08; G06N3/045
Inventor 杜嘉程林木森王琦崔丰麒黄楚盛王超贾忱皓李明轩吴共庆杜高明孙晓胡竟为卢帅勇
Owner HEFEI UNIV OF TECH