A method for implementing single-broadcast and multi-computing based on deep learning accelerator

A deep learning accelerator technology, applied to neural learning methods, instruments, computing, etc. It addresses the high energy consumption and cost of deep learning computation and the failure to fully exploit input data despite its high reuse rate, with the effect of reducing control complexity, avoiding repeated input, and improving utilization.

Active Publication Date: 2022-06-07
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

However, when current deep learning accelerators perform a calculation, they usually fetch the required input feature value once, discard it after the calculation completes, and fetch it again the next time it is needed. In deep learning algorithms, the reuse rate of the input feature values of the convolution operation is very high, and each fetch is expensive in accelerator hardware. Discarding data after each use and fetching it again later therefore wastes a great deal of energy and leaves the data underused during computation, so the energy consumption and cost of deep learning computation remain high.
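To make the reuse claim concrete, the following minimal sketch counts how many output positions read each input position in a small 2-D convolution. It assumes stride 1 and no padding; the function name `input_reuse_counts` and the 5x5 input with a 3x3 kernel are illustrative choices, not taken from the patent.

```python
def input_reuse_counts(height, width, kernel):
    """Count how many output positions read each input position.

    Assumes a square kernel, stride 1, and no padding (illustrative only).
    """
    counts = [[0] * width for _ in range(height)]
    for oy in range(height - kernel + 1):          # output row
        for ox in range(width - kernel + 1):       # output column
            for ky in range(kernel):               # kernel row
                for kx in range(kernel):           # kernel column
                    counts[oy + ky][ox + kx] += 1  # this input is read once more
    return counts


counts = input_reuse_counts(5, 5, 3)
total_reads = sum(sum(row) for row in counts)
for row in counts:
    print(row)
print("total reads:", total_reads, "distinct inputs:", 5 * 5)
```

Under these assumptions the centre input is read by every one of the 9 outputs, so an accelerator that discards it after each use fetches the same value 9 times per output channel.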



Examples

Embodiment Construction

[0029] The present invention will be further described below with reference to the accompanying drawings and specific preferred embodiments, but the protection scope of the present invention is not limited thereby.

[0030] As shown in Figure 1, the method for implementing single-broadcast, multiple-operation computation based on a deep learning accelerator in this embodiment includes: configuring, for a specified multiplier in the multiplier array of the accelerator, a plurality of intermediate value registers for storing intermediate results; and, during deep learning computation, when an input feature value needs to be multiplied by its corresponding weight value, storing the result of that operation in the corresponding intermediate value register for use in the next calculation, until the calculation of the current output feature value is completed. In the convolution calculation process of deep learning, one input feature value needs to be multiplied with multiple input weight...
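The idea in paragraph [0030] can be illustrated with a small software model. This is a sketch under stated assumptions, not the patented hardware design: it uses a 1-D convolution, models the intermediate value registers as a dictionary of partial sums, and compares fetch counts against a baseline that re-fetches inputs for every output. The names `conv1d_single_broadcast` and `conv1d_refetch` are hypothetical.

```python
def conv1d_refetch(inputs, weights):
    """Baseline: every output re-fetches each input value it needs."""
    k = len(weights)
    outputs, fetches = [], 0
    for o in range(len(inputs) - k + 1):
        acc = 0
        for j in range(k):
            acc += inputs[o + j] * weights[j]
            fetches += 1  # one input fetch per multiply
        outputs.append(acc)
    return outputs, fetches


def conv1d_single_broadcast(inputs, weights):
    """Fetch each input once, multiply it with every weight that needs it,
    and accumulate the products in intermediate value registers."""
    k = len(weights)
    n_out = len(inputs) - k + 1
    outputs = [0] * n_out
    regs = {}  # assumed model: output index -> partial sum (at most k live)
    fetches = 0
    for i, x in enumerate(inputs):
        fetches += 1  # x is fetched once and broadcast
        for j, w in enumerate(weights):
            o = i - j  # output that combines input i with weight j
            if 0 <= o < n_out:
                regs[o] = regs.get(o, 0) + x * w
        done = i - k + 1  # this output has now seen all k of its inputs
        if 0 <= done < n_out:
            outputs[done] = regs.pop(done)
    return outputs, fetches


inputs, weights = [1, 2, 3, 4, 5], [1, 0, -1]
base_out, base_fetches = conv1d_refetch(inputs, weights)
bcast_out, bcast_fetches = conv1d_single_broadcast(inputs, weights)
print(base_out, base_fetches)
print(bcast_out, bcast_fetches)
```

Both functions produce the same outputs, but in this model the single-broadcast version fetches each of the 5 inputs once, while the baseline performs 9 fetches for the same 3 outputs.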


Abstract

The invention discloses a method for realizing single-broadcast, multiple-operation computation based on a deep learning accelerator. The method includes: configuring, for a specified multiplier in the multiplier array of the accelerator, a plurality of intermediate value registers for storing intermediate results; and, during deep learning computation, when an input feature value needs to be multiplied by its corresponding weight value, storing the result of that operation in the corresponding intermediate value register for use in the next calculation, until the calculation of the output feature value is completed. The invention offers a simple implementation, enables single-broadcast, multiple-operation computation in deep learning accelerators, and provides low cost, high data utilization, and low energy consumption.

Description

Technical field

[0001] The invention relates to the technical field of deep learning accelerators, in particular to a method for implementing single-broadcast, multiple-operation computation based on a deep learning accelerator.

Background technique

[0002] Deep neural networks (DNNs) are the foundation of artificial intelligence applications, including self-driving cars, cancer detection, computer vision, speech recognition, robotics, complex games and more. DNNs achieve very high accuracy on artificial intelligence tasks and can even surpass human accuracy. Their outstanding performance stems from their ability to use statistical learning methods to extract high-level features from raw sensory data and to obtain an effective representation of the input space from large amounts of data, but the complexity of deep learning is high. In deep learning, neural networks have a very large number of layers; current networks usually have 5 to 1000 layers, ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06N3/04, G06N3/08, G06F7/523, G06F7/50
CPC: G06F7/50, G06F7/523, G06N3/08, G06N3/045
Inventor: 陈书明, 杨超, 李斌, 陈海燕, 扈啸, 张军阳, 陈伟文
Owner: NAT UNIV OF DEFENSE TECH