Convolutional neural network hardware acceleration device, convolution calculation method, and storage medium

A convolutional neural network hardware acceleration technology, applied in the computer field, that addresses problems such as insufficient parallel computing, complex instructions, and insufficiently targeted effects.

Inactive Publication Date: 2018-06-22
NATIONZ TECH INC

AI Technical Summary

Problems solved by technology

[0008] The purpose of the present invention is to provide an instruction-set-controlled convolutional neural network hardware acceleration device and a convolution calculation method that solve the prior-art problems of insufficient parallel computing, insufficient on-chip storage, complex instructions, and insufficiently targeted effects.



Examples


Embodiment

[0149] The following embodiment applies the invention to convolution calculation.

[0150] The convolutional neural network hardware acceleration device, convolution calculation method, and storage medium used in this embodiment are as described in the detailed description and are not repeated here.

[0151] Figure 5 is a structural schematic diagram of the parallel multiplier / multiply-accumulator of the embodiment. As shown in Figure 5, the instruction processing unit 1 uses the instruction decoding results to control the entire data operation unit 22 in performing parallel multiplication or multiply-accumulate calculations, so each parallel multiplier / multiply-accumulator 2211 can be configured by instruction to perform either a multiply-accumulate or a multiplication operation. The temporary register reg stores the intermediate value after each multiply-accumulate or multiply operation. The input ports of the parallel multiplier...
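The behavior described in [0151] can be sketched in software as follows. This is only an illustrative model, not the patented circuit: the names `Mode`, `MulMacUnit`, `ParallelArray`, and `issue` are hypothetical, and the lane count and integer operands are assumptions. It shows one decoded mode being applied to every lane, with each lane's temporary register `reg` either overwritten (multiply) or accumulated into (multiply-accumulate).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Mode(Enum):
    """Hypothetical instruction-selected operating mode for a lane."""
    MUL = "mul"  # plain multiplication
    MAC = "mac"  # multiply-accumulate


@dataclass
class MulMacUnit:
    """One multiplier / multiply-accumulator with a temporary register."""
    reg: int = 0  # holds the intermediate value after each operation

    def step(self, a: int, b: int, mode: Mode) -> int:
        product = a * b
        if mode is Mode.MAC:
            self.reg += product  # accumulate onto the stored value
        else:
            self.reg = product   # overwrite with the plain product
        return self.reg


@dataclass
class ParallelArray:
    """A bank of units that all receive the same decoded mode."""
    lanes: int
    units: List[MulMacUnit] = field(init=False)

    def __post_init__(self) -> None:
        self.units = [MulMacUnit() for _ in range(self.lanes)]

    def issue(self, a_vec: List[int], b_vec: List[int], mode: Mode) -> List[int]:
        """Apply one decoded instruction to every lane in lockstep."""
        if len(a_vec) != self.lanes or len(b_vec) != self.lanes:
            raise ValueError("operand vectors must match the lane count")
        return [u.step(a, b, mode) for u, a, b in zip(self.units, a_vec, b_vec)]
```

Issuing `Mode.MUL` first and `Mode.MAC` afterwards mirrors how a convolution window could be reduced: the first product initializes each register and later products accumulate onto it.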


Abstract

The invention relates to a convolutional neural network hardware acceleration device, a convolution calculation method, and a storage medium. The device comprises an instruction processing unit, a hardware acceleration module, and an external data memory unit. The instruction processing unit decodes an instruction set and executes the corresponding operations so as to control the hardware acceleration module. The hardware acceleration module comprises an input caching unit, a data operation unit, and an output caching unit. The input caching unit executes the memory access operations of the instruction processing unit and stores data read from the external data memory unit. The data operation unit executes the computation operations of the instruction processing unit, processes the data operations of the convolutional neural network, and controls the data operation process and the data flow direction according to an operation instruction set. The output caching unit executes the memory access operations of the instruction processing unit and stores the calculation results that are output by the data operation unit and need to be written into the external data memory unit. The external data memory unit stores the calculation results output by the output caching unit and transmits data to the input caching unit in response to reads from the input caching unit.
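The data flow in the abstract can be illustrated with a small software model. Everything specific here is an assumption made for illustration: the opcodes `LOAD`, `CONV`, and `STORE`, the `(opcode, operand)` instruction encoding, the cache keys `"x"` and `"w"`, and the use of a one-dimensional convolution are not taken from the patent. The sketch only shows the ordering the abstract describes: external memory to input cache, input cache to data operation unit, data operation unit to output cache, output cache back to external memory.

```python
from typing import Dict, List, Tuple

# Hypothetical instruction encoding: (opcode, operand name).
Instruction = Tuple[str, str]


def conv1d_valid(signal: List[int], kernel: List[int]) -> List[int]:
    """Sliding dot product without kernel flip, as is usual in CNNs."""
    n = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ]


class Accelerator:
    """Toy model of the instruction-controlled data path."""

    def __init__(self, external_memory: Dict[str, List[int]]) -> None:
        self.external_memory = external_memory      # external data memory unit
        self.input_cache: Dict[str, List[int]] = {}   # input caching unit
        self.output_cache: Dict[str, List[int]] = {}  # output caching unit

    def execute(self, program: List[Instruction]) -> None:
        """Decode each instruction and drive the matching unit."""
        for opcode, operand in program:
            if opcode == "LOAD":
                # Memory access: external memory -> input cache.
                self.input_cache[operand] = list(self.external_memory[operand])
            elif opcode == "CONV":
                # Computation: input cache -> data operation -> output cache.
                self.output_cache[operand] = conv1d_valid(
                    self.input_cache["x"], self.input_cache["w"]
                )
            elif opcode == "STORE":
                # Memory access: output cache -> external memory.
                self.external_memory[operand] = list(self.output_cache[operand])
            else:
                raise ValueError(f"unknown opcode: {opcode}")


memory = {"x": [1, 2, 3, 4], "w": [1, 0, -1]}
acc = Accelerator(memory)
acc.execute([("LOAD", "x"), ("LOAD", "w"), ("CONV", "y"), ("STORE", "y")])
print(memory["y"])  # [-2, -2]
```

Keeping the caches separate from external memory in the model makes the two kinds of instruction visible: memory access instructions only move data, while computation instructions only touch the caches.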

Description

Technical field

[0001] The invention belongs to the technical field of computers and mainly relates to a convolutional neural network hardware acceleration device controlled by an instruction set and a convolution calculation method.

Background

[0002] In recent years, as artificial intelligence has grown in popularity, more and more deep learning algorithm models have been proposed to solve current research problems. However, as research on deep learning algorithms deepens, models with more layers and more complex structures keep being proposed, there is still no relatively complete theoretical system, and the algorithms are still evolving. Therefore, current terminal hardware implementations of dedicated deep learning algorithm models inevitably face the problem that their architecture is difficult to change and reprogram.

[0003] Among the current technologies, one is to use general-purpose processors to calculate deep l...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/063, G06F9/30
CPC: G06F9/30007, G06N3/063, G06F9/30145, G06N3/045, G06F9/30101, G06F9/3802
Inventor: 罗聪, 万文涛, 梁洁, 谢华
Owner: NATIONZ TECH INC