Deep learning accelerator suitable for stacked hourglass network

A deep learning accelerator technology for stacked hourglass networks, applied in the field of neural network training. It addresses memory access delays that take up hardware running time, reduced battery life, and the inability of earlier accelerators to speed up this network structure, with the aim of improving efficiency, improving data utilization, and reducing delay.

Active Publication Date: 2019-07-09
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

[0004] The stacked hourglass network structure uses a large number of depthwise separable convolution modules and multi-level residual structures. During calculation, these layers require many calculation units to access memory to obtain the data they need, and the delay generated during memory access takes up most of the hardware running time. Earlier deep neural network accelerators did not provide computing circuits optimized for the memory access patterns of this network structure, so they cannot accelerate it effectively.
At the same time, the additional memory accesses caused by unoptimized circuit design bring additional power consumption, which greatly reduces the battery life of devices equipped with this type of accelerator unit.
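
The memory-bound behavior described in [0004] can be illustrated with a rough count of multiply-accumulate operations (MACs) against data elements moved. The following is a minimal sketch and not part of the patent: the function names are hypothetical, stride 1 with "same" padding is assumed, and every element is assumed to be transferred exactly once with no on-chip reuse of the intermediate feature map.

```python
def conv_costs(h, w, c_in, c_out, k):
    """MACs and elements moved for a standard k x k convolution.

    Assumes stride 1 and 'same' padding, so the output is h x w.
    """
    macs = h * w * c_in * c_out * k * k
    # Input map + weights + output map, each counted once.
    elems = h * w * c_in + k * k * c_in * c_out + h * w * c_out
    return macs, elems


def depthwise_separable_costs(h, w, c_in, c_out, k):
    """MACs and elements moved for depthwise k x k plus pointwise 1 x 1."""
    dw_macs = h * w * c_in * k * k
    pw_macs = h * w * c_in * c_out
    # Assumption: the intermediate map is written out by the depthwise
    # stage and read back by the pointwise stage (no on-chip caching).
    dw_elems = h * w * c_in + k * k * c_in + h * w * c_in
    pw_elems = h * w * c_in + c_in * c_out + h * w * c_out
    return dw_macs + pw_macs, dw_elems + pw_elems


def macs_per_element(macs, elems):
    """Arithmetic intensity: useful work per element moved."""
    return macs / elems
```

Under these assumptions, for a 64 x 64 map with 256 input and output channels and a 3 x 3 kernel, the depthwise separable form needs far fewer MACs but also performs far fewer MACs per element moved, which is consistent with memory access dominating the running time.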

Examples

Embodiment Construction

[0046] The preferred embodiments of the present invention will be described below in conjunction with the accompanying drawings. It should be understood that the preferred embodiments described here are only used to illustrate and explain the present invention, and are not intended to limit the present invention.

[0047] As shown in Figure 1, this embodiment discloses a deep learning accelerator suitable for a stacked hourglass network, including a control module 1, a data calculation module 2, and a data cache module 3;

[0048] The control module 1 is connected to the main control processor, and is used to receive the control signal input by the main control processor, and control the data calculation module 2 and the data cache module 3 according to the control signal;

[0049] Specifically, as shown in Figure 2, the data calculation module 2 includes a plurality of layer calculation units 21; the layer calculation units 21 are used to perform data processing op...

Abstract

The invention discloses a deep learning accelerator suitable for a stacked hourglass network. Layer calculation units that operate in parallel improve computing parallelism, and a data cache module improves the utilization of data loaded into the accelerator while speeding up computation. In addition, a data adjuster in the accelerator adaptively changes the data arrangement order according to the operation of each calculation layer, which improves the completeness of the data fetched, improves data acquisition efficiency, and reduces the delay of the memory access process. The accelerator therefore increases algorithm calculation speed while effectively reducing memory bandwidth usage, by reducing the number of memory accesses and improving memory access efficiency, and thereby achieves overall computational acceleration.
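
The abstract does not say which arrangements the data adjuster switches between. As a hedged illustration only, the sketch below assumes two layouts for a flat feature-map buffer: channel-major (CHW), where each channel is contiguous and which is assumed here to suit depthwise layers, and pixel-major (HWC), where all channels of a pixel are contiguous and which is assumed to suit pointwise layers. The function names and the layer-to-layout mapping are hypothetical and not taken from the patent.

```python
def to_channel_major(hwc, h, w, c):
    """Reorder a flat HWC buffer into CHW order."""
    return [hwc[(y * w + x) * c + ch]
            for ch in range(c) for y in range(h) for x in range(w)]


def to_pixel_major(chw, h, w, c):
    """Reorder a flat CHW buffer into HWC order."""
    return [chw[(ch * h + y) * w + x]
            for y in range(h) for x in range(w) for ch in range(c)]


def arrange_for_layer(buf, layout, layer_kind, h, w, c):
    """Return (buffer, layout) in the order assumed to suit layer_kind.

    Assumption: 'depthwise' layers prefer CHW, all other layers HWC.
    The buffer is returned unchanged when it is already in that order.
    """
    wanted = "CHW" if layer_kind == "depthwise" else "HWC"
    if layout == wanted:
        return buf, layout
    if wanted == "CHW":
        return to_channel_major(buf, h, w, c), "CHW"
    return to_pixel_major(buf, h, w, c), "HWC"
```

With such an arrangement, each layer type would read its operands from consecutive addresses, which is one plausible way that a single burst could fetch only useful data.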

Description

Technical field

[0001] The invention belongs to the field of neural network training, and in particular relates to a deep learning accelerator suitable for a stacked hourglass network.

Background technique

[0002] Deep neural networks are an algorithm model in deep learning. Because they outperform traditional algorithms, they have been widely used in fields such as image classification, target recognition, and gesture recognition. Deep neural networks require a large amount of computation. Traditional general-purpose processors compute slowly because of architectural limitations and cannot meet the needs of real-time applications. Dedicated neural network accelerators are therefore needed to provide hardware support for real-time computation of deep neural networks.

[0003] In the application of gesture recognition, a deep neural network structure called stacked hourglass network (Stacked...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/063, G06N3/08, G06N3/04
CPC: G06N3/063, G06N3/08, G06N3/045
Inventor: 栗涛, 陈弟虎, 梁东宝, 萧嘉乐, 叶灵昶
Owner: SUN YAT SEN UNIV