Hardware acceleration implementation architecture for backward training of convolutional neural network based on FPGA

A convolutional neural network and convolutional layer technology, applied in the field of deep learning, that solves problems such as insufficient storage bandwidth and achieves a significant acceleration effect, a regular structure, and improved storage bandwidth.

Active Publication Date: 2019-12-06
UNIV OF ELECTRONICS SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

Therefore, the FPGA is a very good choice for deep learning acceleration. However, there are few studies on the specific structure of FPGA implementations ...



Examples


Example 1

[0068] Example 1: FPGA simulation and implementation of the backward training process of the Hcnn convolutional neural network

[0069] The simulation platforms used in Example 1 are Matlab R2017b, ISE 14.7, and Modelsim 10.1a, and the implemented architecture is as shown in Figure 2. First, the Hcnn convolutional neural network of Figure 2 is verified by fixed-point simulation in Matlab R2017b, where the accuracy of the model reaches 95.34%. Then the simulation, verification, and implementation of the hardware architecture are carried out in ISE 14.7 and Modelsim 10.1a. In both the Matlab fixed-point simulation and the FPGA implementation, most parameters and intermediate register variables adopt the fixed-point format fi(1,18,12), that is, 1 sign bit, 5 integer bits, and 12 fractional bits.
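The fi(1,18,12) quantization described above can be reproduced outside Matlab for checking purposes. The following is a minimal Python/NumPy sketch (not part of the patent; the helper names to_fixed and to_float are illustrative) of rounding values onto this 18-bit grid with saturation:

```python
import numpy as np

# fi(1,18,12): signed, 18-bit word, 12 fractional bits -> 1 sign + 5 integer bits
TOTAL_BITS = 18
FRAC_BITS = 12
SCALE = 1 << FRAC_BITS                    # 2**12 = 4096
MAX_INT = (1 << (TOTAL_BITS - 1)) - 1     # +131071 -> about +31.99976
MIN_INT = -(1 << (TOTAL_BITS - 1))        # -131072 -> -32.0

def to_fixed(x):
    """Quantize floats to the fi(1,18,12) grid, saturating out-of-range values."""
    q = np.round(np.asarray(x, dtype=np.float64) * SCALE)
    return np.clip(q, MIN_INT, MAX_INT).astype(np.int32)

def to_float(q):
    """Map the stored integers back to the real values they represent."""
    return np.asarray(q, dtype=np.float64) / SCALE

w = np.array([0.3141, -1.5, 31.99999, -40.0])
print(to_float(to_fixed(w)))   # ~[0.3142, -1.5, 31.99976, -32.0]; last entry saturates
```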

[0070] The Modelsim simulation results of the forward prediction process of the Hcnn convolutional neural network in Example 1 are as shown in Figure 9. It can be seen from the...

Example 2

[0073] The simulation platforms used in Example 2 are ISE 14.7 and PyCharm. First, it can be seen from Figure 9 that the data processing time of the model is 821 clock cycles (excluding the time to read input data), so at a clock frequency of 200 MHz, the time used is 4105 ns.
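The conversion from cycle count to wall-clock time is a one-line check (plain Python, figures taken from the paragraph above):

```python
# Cross-check of paragraph [0073]: 821 cycles at a 200 MHz clock.
cycles = 821                      # data processing time, input reads excluded
clock_hz = 200e6                  # 200 MHz
print(cycles / clock_hz * 1e9)    # 4105.0 ns, matching the figure quoted above
```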

[0074] Then, on the PyCharm simulation platform, an Intel E3-1230V2 @ 3.30 GHz CPU and a TitanX GPU are used to perform the computation of the structure in Example 1; the time for the CPU and the GPU to process one sample is 7330 ns and 405 ns, respectively.

[0075] Figure 11 compares the speed and power consumption of the Example 1 structure as implemented on FPGA, CPU, and GPU. As can be seen from the figure, in terms of speed, the convolutional neural network FPGA implementation architecture of the present invention achieves an improvement of about three times over the CPU; there is still a certain gap compared with the GPU, which is limited by ...



Abstract

The invention discloses a hardware acceleration implementation architecture for backward training of a convolutional neural network based on an FPGA. Starting from a basic processing module for the backward training of each layer of a convolutional neural network, the architecture comprehensively considers operation processing time and resource consumption, and realizes the backward training process of the Hcnn convolutional neural network in a parallel pipeline form by using methods such as parallel-serial conversion, data fragmentation, pipeline design, and resource reuse, following the principle of achieving the maximum degree of parallelism with the minimum resource consumption. The architecture fully utilizes the data parallelism and pipeline parallelism of the FPGA; the implementation is simple, the structure is regular, the wiring is consistent, the operating frequency is greatly improved, and the acceleration effect is remarkable. More importantly, the architecture uses an optimized systolic array structure to balance IO reads and writes against computation, improving throughput while consuming less storage bandwidth, and effectively solving the problem in FPGA implementations of convolutional neural networks that the required data access speed is much higher than the data processing speed.
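The abstract does not disclose the RTL of the optimized systolic array, but the balancing idea can be illustrated abstractly. Below is a minimal Python sketch (a simplified weight-stationary variant with illustrative data, not the patented structure) of a 1-D systolic pipeline that consumes one input sample per cycle and, once full, emits one finished output per cycle, overlapping IO with computation:

```python
import numpy as np

def systolic_conv1d(x, w):
    """Cycle-by-cycle sketch of a weight-stationary systolic pipeline that
    computes the valid 1-D correlation y[i] = sum_j w[j] * x[i+j].

    Each cell holds one weight; every cycle one new input sample is fed in
    and partial sums advance one cell, so after the pipeline fills, the
    array emits one finished output per cycle. This overlap of input reads
    with computation is the IO/compute balancing the abstract refers to.
    """
    k = len(w)
    psum = np.zeros(k)          # partial-sum register after each cell
    out = []
    for t, sample in enumerate(x):
        # each cell adds its weight times this cycle's sample to the
        # partial sum arriving from its left neighbour (cell 0 gets 0)
        shifted = np.concatenate(([0.0], psum[:-1]))
        psum = shifted + w * sample
        if t >= k - 1:          # pipeline full: one output per cycle
            out.append(psum[-1])
    return np.array(out)

x = np.arange(8.0)
w = np.array([1.0, 2.0, 3.0])
assert np.allclose(systolic_conv1d(x, w), np.correlate(x, w, mode="valid"))
```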

Description

Technical field

[0001] The present invention relates to the field of deep learning, one of the important development directions in artificial intelligence, and specifically relates to a hardware acceleration implementation architecture for backward training of convolutional neural networks based on FPGA.

Background art

[0002] In recent years, the field of artificial intelligence, especially machine learning, has achieved breakthroughs in both theory and application. Deep learning is one of the most important development directions of machine learning. Deep learning can learn features at multiple levels of abstraction, and therefore performs excellently on complex and abstract learning problems. However, as problems become more complex and abstract, deep learning network models grow more complex, and the time needed to train them increases accordingly. For example, Google's "AlphaGo" uses a multi-layer neural netw...


Application Information

IPC(8): G06N3/063, G06N3/04
CPC: G06N3/063, G06N3/045
Inventors: 黄圳, 何春, 李玉柏, 王坚
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA