
A device and method for executing LSTM operations

A technology involving operation modules and operation results, applied in the field of artificial neural networks, which addresses problems such as high power consumption, the lack of dedicated support for multi-layer artificial neural network computation, and off-chip bandwidth performance bottlenecks.

Active Publication Date: 2020-03-27
CAMBRICON TECH CO LTD
Cites: 6 | Cited by: 0

AI Technical Summary

Problems solved by technology

Since the GPU is a device dedicated to graphics, image, and scientific computation, it has no special support for multi-layer artificial neural network operations; performing such operations therefore still requires a large amount of front-end decoding work, which introduces significant additional overhead.
In addition, the GPU has only a small on-chip cache, so the model data (weights) of recurrent neural networks and LSTMs must be repeatedly moved in from off-chip memory; the off-chip bandwidth becomes the main performance bottleneck and at the same time incurs a huge power consumption overhead.




Embodiment Construction

[0018] Figure 1 shows a schematic diagram of the overall structure of the device for performing recurrent neural network and LSTM operations according to an embodiment of the present invention. As shown in Figure 1, the device includes an instruction storage unit 1, a controller unit 2, a data access unit 3, an interconnection module 4, a master operation module 5, and a plurality of slave operation modules 6. The instruction storage unit 1, controller unit 2, data access unit 3, interconnection module 4, master operation module 5, and slave operation modules 6 can all be implemented as hardware circuits (including but not limited to FPGAs, CGRAs, application-specific integrated circuits (ASICs), analog circuits, and memristors).
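To make the division of labor among these units concrete, the following is a minimal software sketch of the module layout, assuming a row-wise split of the weight matrix across the slave operation modules; all class and method names are illustrative assumptions (they are not taken from the patent), and the actual device is a hardware circuit rather than software.

    import numpy as np

    class SlaveOperationModule:
        """Holds one slice of the weights and accumulates a multiply-add partial sum."""
        def __init__(self, weight_slice: np.ndarray):
            self.weights = weight_slice                        # this slave's rows of the weight matrix
            self.partial_sum = np.zeros(weight_slice.shape[0])

        def accumulate(self, inputs: np.ndarray) -> None:
            self.partial_sum += self.weights @ inputs          # multiply-add into the stored partial sum

    class MasterOperationModule:
        """Merges the slave results and applies the activation function."""
        def activate(self, partial_sums: list) -> np.ndarray:
            merged = np.concatenate(partial_sums)              # the interconnection module's role, in effect
            return np.tanh(merged)                             # e.g. tanh, as used inside an LSTM cell

    class LSTMDevice:
        """Top-level container loosely mirroring units 1-6 of Figure 1."""
        def __init__(self, weights: np.ndarray, num_slaves: int):
            self.instruction_storage: list = []                # unit 1 (instruction storage unit)
            # units 2 (controller) and 3 (data access) are plain method calls in this sketch
            self.master = MasterOperationModule()              # unit 5
            self.slaves = [SlaveOperationModule(w)             # unit 6 (plurality of slave modules)
                           for w in np.array_split(weights, num_slaves, axis=0)]

        def forward(self, inputs: np.ndarray) -> np.ndarray:
            for slave in self.slaves:                          # each slave works on its own weight rows
                slave.accumulate(inputs)
            return self.master.activate([s.partial_sum for s in self.slaves])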

[0019] The instruction storage unit 1 reads instructions through the data access unit 3 and caches the read instructions. The instruction storage unit 1 can be implemented by various storage devices (SRAM, DRAM, eDRAM, memris...
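A small sketch of this instruction path is given below, under the assumption that "caching" here means keeping already-fetched instructions on chip so that repeated reads avoid off-chip traffic; the class names and the dict-based external memory are purely illustrative.

    class DataAccessUnit:
        """Stand-in for unit 3: reads from (assumed) external memory."""
        def __init__(self, external_memory: dict):
            self.external_memory = external_memory

        def read(self, address: int) -> str:
            return self.external_memory[address]

    class InstructionStorageUnit:
        """Stand-in for unit 1: caches instructions fetched through the data access unit."""
        def __init__(self, data_access: DataAccessUnit):
            self.data_access = data_access
            self.cache: dict = {}

        def fetch(self, address: int) -> str:
            if address not in self.cache:                      # miss: go through the data access unit once
                self.cache[address] = self.data_access.read(address)
            return self.cache[address]                         # hit: served from on-chip storage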



Abstract

The invention provides a device for executing recurrent neural network and LSTM operations. The device comprises an instruction storage unit, a controller unit, a data access unit, an interconnection module, a master operation module, and a plurality of slave operation modules. Each slave operation module multiplies and adds input data to obtain a partial sum, stores the partial sum until all neuron data have been input, and then returns the result to the master operation module. The master operation module applies an interpolated activation to the sum returned by the slave operation modules in the forward pass, and in the backward pass performs interpolation to obtain the activation derivative and multiplies it by the gradient. The device alleviates the problems of insufficient CPU and GPU operation performance and high front-end decoding overhead, and effectively improves support for the forward operation of multi-layer artificial neural networks.
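As a rough numeric illustration of this forward/backward split (not the patented hardware itself), the sketch below assumes the weight matrix is partitioned along the input dimension across the slaves, approximates the activation with a 1024-entry sigmoid lookup table and linear interpolation, and uses hypothetical function names throughout.

    import numpy as np

    # Lookup table for an interpolated sigmoid activation and its derivative
    # (the table range, size, and choice of sigmoid are assumptions for illustration).
    _X = np.linspace(-8.0, 8.0, 1024)
    _SIGMOID = 1.0 / (1.0 + np.exp(-_X))
    _SIGMOID_DERIV = _SIGMOID * (1.0 - _SIGMOID)

    def slave_partial_sum(weight_slice: np.ndarray, input_slice: np.ndarray) -> np.ndarray:
        """Multiply-add over one slave's slice of the inputs; held until all inputs have arrived."""
        return weight_slice @ input_slice

    def master_forward(partial_sums: list) -> np.ndarray:
        """Sum the partial results returned by the slaves and apply the interpolated activation."""
        pre_activation = np.sum(partial_sums, axis=0)
        return np.interp(pre_activation, _X, _SIGMOID)

    def master_backward(pre_activation: np.ndarray, grad_out: np.ndarray) -> np.ndarray:
        """Interpolate the activation derivative at the forward pre-activation and multiply by the gradient."""
        return np.interp(pre_activation, _X, _SIGMOID_DERIV) * grad_out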

Description

Technical field

[0001] The present invention relates to the technical field of artificial neural networks, and in particular to a device and method for executing LSTM operations.

Background technique

[0002] Recurrent neural networks and LSTMs are widely used in speech recognition, language modeling, machine translation, image captioning, and other fields. In recent years, owing to their high recognition accuracy and good parallelizability, they have attracted increasingly widespread attention in both academia and industry.

[0003] One known approach to supporting recurrent neural networks and LSTMs is to use a general-purpose processor, which executes general-purpose instructions using general-purpose register files and general-purpose functional units. One of the disadvantages of this approach is that the computing performance of a single general-purpose processor is low and cannot meet the performance requirements of typical recurrent neural ...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/063; G06F9/30
CPC: G06F9/30007; G06N3/063; Y02D10/00
Inventor: 郭崎, 陈峋宇, 陈云霁, 陈天石
Owner: CAMBRICON TECH CO LTD