Convolutional neural network hardware accelerator based on parallel multiplexing, and parallel multiplexing method

A convolutional neural network hardware accelerator technology, applied in the field of artificial intelligence hardware design. It addresses problems such as limited computing parallelism, high hardware complexity, and wasted resources, so as to improve network multiplexing, improve flexibility, and shorten computation time.

Pending Publication Date: 2022-07-22
HEFEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0010] To overcome the shortcomings of the above-mentioned prior art, the present invention proposes a hardware accelerator for a convolutional neural network based on parallel multiplexing, and a parallel multiplexing method. The multiplexing method simplifies large-scale convolution calculations in neural networks, thereby realizing convolutional neural network computation with high computing parallelism, high data reuse, and low hardware complexity, improving hardware flexibility and reducing wasted resources.




Embodiment Construction

[0091] In this embodiment, the convolutional neural network includes K convolutional layers, K+1 activation layers, K pooling layers, and 2 fully connected layers. The convolutional neural network is trained on the MNIST handwritten digit dataset to obtain the weight parameters.

[0092] The convolutional neural network used in this embodiment is a handwritten digit recognition network, as shown in Figure 1. Its structure includes 2 convolutional layers, 3 activation layers, 2 pooling layers, and 2 fully connected layers, using 3×3 convolution kernels. Except for the activation layers, which immediately follow the convolutional layers and do not affect the feature-map size, the other six layers are arranged as follows: the first layer is a convolutional layer with input feature size 28×28×1, weight matrix size 3×3×1×4, bias vector length 4, and output feature size ...
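The first-layer arithmetic above can be traced with a minimal NumPy sketch (a software model only, not the patent's hardware design; stride 1 and no padding are assumptions, since the paragraph does not state them):

```python
import numpy as np

def conv2d(x, w, b):
    """Valid (no-padding) stride-1 convolution: x (H, W, Cin), w (K, K, Cin, Cout), b (Cout,)."""
    H, W, _ = x.shape
    K, _, _, Cout = w.shape
    Ho, Wo = H - K + 1, W - K + 1
    y = np.zeros((Ho, Wo, Cout))
    for i in range(Ho):
        for j in range(Wo):
            patch = x[i:i + K, j:j + K, :]
            # sum over the 3x3xCin window for each output channel, then add bias
            y[i, j, :] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2])) + b
    return y

def relu(x):
    return np.maximum(x, 0)

def maxpool2x2(x):
    H, W, C = x.shape
    return x[:H // 2 * 2, :W // 2 * 2, :].reshape(H // 2, 2, W // 2, 2, C).max(axis=(1, 3))

# First layer of the described network: 28x28x1 input, 3x3x1x4 weights, bias length 4
x = np.random.rand(28, 28, 1)
w1 = np.random.rand(3, 3, 1, 4) * 0.1
b1 = np.zeros(4)
y1 = maxpool2x2(relu(conv2d(x, w1, b1)))
print(y1.shape)  # (13, 13, 4) under the stride-1, no-padding assumption
```

Under these assumptions the convolution produces a 26×26×4 feature map, which 2×2 pooling reduces to 13×13×4.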



Abstract

The invention discloses a hardware accelerator for a convolutional neural network based on parallel multiplexing, and a parallel multiplexing method. The hardware accelerator comprises a parameter storage module, a REG-FIFO module, a counting control module, an input-multiplexing convolution operation module, an activation module, and a pooling layer module. The parameter storage module pre-stores image parameters and trained weight parameters; the REG-FIFO module generates an input matrix matched to the convolution kernel and reads matrix data; the counting control module counts clock cycles and, according to that count, controls the input and output of the REG-FIFO module. The input-multiplexing convolution operation module performs the convolution operations of the convolutional and fully connected layers; the activation module performs the activation of the convolutional and fully connected layer outputs; and the pooling layer module performs the pooling of the activated convolutional layer output. The invention aims to realize convolutional neural network computation with high operational parallelism, high data multiplexing, and low hardware complexity.
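The REG-FIFO idea described above — turning a row-major pixel stream into k×k input matrices matched to the convolution kernel — can be modelled in software roughly as follows. This is a hypothetical sketch using row buffers; the function name and the row-at-a-time emission order are assumptions, not the patent's cycle-accurate design:

```python
from collections import deque

def window_stream(pixels, width, k=3):
    """Software model of a line-buffer window generator: buffer the k most
    recent image rows and slide a k x k window across them, emitting one
    window per valid position, as a REG-FIFO structure would over time."""
    rows = deque(maxlen=k)   # the k most recently completed rows
    current = []
    out = []
    for p in pixels:
        current.append(p)
        if len(current) == width:        # a full row has streamed in
            rows.append(current)
            current = []
            if len(rows) == k:           # enough rows buffered for k x k windows
                for j in range(width - k + 1):
                    out.append([row[j:j + k] for row in rows])
    return out

# 4x4 image streamed as pixels 0..15 yields (4-2) * (4-2) = 4 windows
windows = window_stream(list(range(16)), width=4, k=3)
print(len(windows))  # 4
```

A real hardware REG-FIFO would emit one window per clock once primed; the model above batches them per row, but the windows it produces are the same ones the convolution module would consume.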

Description

technical field
[0001] The invention belongs to the field of artificial intelligence hardware design, and in particular relates to a method for implementing a parallel-multiplexing convolutional neural network computing accelerator.
Background technique
[0002] Convolutional Neural Network (CNN) is a popular direction in the field of Deep Learning (DL) in recent years. However, with the growing data volume of CNN models and ever-higher computational and accuracy requirements, accelerating convolutional neural networks has become a challenge.
[0003] In general, two characteristics are exploited to speed up the CNN algorithm: (1) Sparse connection: the connections between internal neurons in the calculation are changed to a non-fully-connected form; from the perspective of CNN visualization, a given response in the output feature map is no longer connected to ...
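The sparse-connection characteristic (together with weight sharing) can be made concrete with a quick parameter count for the first layer of the embodiment's network; this is illustrative arithmetic only, with stride 1 and no padding assumed:

```python
# Parameter count for a 28x28x1 -> 26x26x4 mapping (3x3 kernel, stride 1, no padding)
H = W = 28
C_in, C_out, K = 1, 4, 3
H_out = W_out = H - K + 1                                   # 26 under the no-padding assumption

dense_weights = (H * W * C_in) * (H_out * W_out * C_out)    # fully connected: every input feeds every output
conv_weights = K * K * C_in * C_out                         # sparse + shared: one 3x3 kernel per channel pair
print(dense_weights, conv_weights)  # 2119936 36
```

Each output response touches only a 3×3 neighbourhood of the input (sparse connection), and all positions reuse the same kernel (weight sharing), which is what makes the high data reuse claimed by the invention possible in hardware.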


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/063; G06N3/04; G06N3/08
CPC: G06N3/063; G06N3/08; G06N3/045
Inventor 杜嘉程林木森王琦崔丰麒黄楚盛王超贾忱皓李明轩吴共庆杜高明孙晓胡竟为卢帅勇
Owner HEFEI UNIV OF TECH