A Deep Learning Accelerator for Stacked Hourglass Networks

A deep learning accelerator for stacked hourglass networks, applied in the field of neural network training. It addresses three problems: memory access latency occupying hardware running time, reduced battery life, and the inability of earlier accelerators to speed up this network structure. The stated benefits are improved efficiency, improved data utilization, and reduced delay.

Active Publication Date: 2021-04-13

SUN YAT SEN UNIV

Cites: 4; Cited by: 0

Problems solved by technology

[0004] The stacked hourglass network uses a large number of depthwise separable convolution modules and multi-level residual structures. During computation, these layers require many calculation units to access memory to fetch the data they need, and the latency of those memory accesses takes up most of the hardware running time. Earlier deep neural network accelerators did not provide computing circuits optimized for the memory access patterns of this network structure, so they cannot accelerate it effectively.

At the same time, the additional memory accesses caused by an unoptimized circuit design consume additional power, which greatly reduces the battery life of devices equipped with this type of accelerator unit.
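The claim that memory access dominates for this kind of layer can be illustrated with a rough cost model. The sketch below is not taken from the patent: it assumes stride 1, "same" padding, that every input, weight, and output element crosses the memory interface exactly once per stage, and that the depthwise result is written out and read back before the pointwise stage. The function names are hypothetical.

```python
def conv_costs(h, w, c_in, c_out, k):
    """Return (macs, elements_moved) for a standard k x k convolution.

    Assumption: each input, weight, and output element is transferred once.
    """
    macs = h * w * c_in * c_out * k * k
    moved = h * w * c_in + k * k * c_in * c_out + h * w * c_out
    return macs, moved


def depthwise_separable_costs(h, w, c_in, c_out, k):
    """Return (macs, elements_moved) for a depthwise k x k stage followed
    by a pointwise 1 x 1 stage.

    Assumption: the depthwise output is stored and re-read by the
    pointwise stage, so it is counted twice in the traffic.
    """
    # Depthwise stage: one k x k filter per input channel.
    dw_macs = h * w * c_in * k * k
    dw_moved = h * w * c_in + k * k * c_in + h * w * c_in
    # Pointwise stage: 1 x 1 convolution that mixes channels.
    pw_macs = h * w * c_in * c_out
    pw_moved = h * w * c_in + c_in * c_out + h * w * c_out
    return dw_macs + pw_macs, dw_moved + pw_moved


def intensity(macs, moved):
    """Multiply-accumulates performed per element moved."""
    return macs / moved


if __name__ == "__main__":
    shape = (64, 64, 256, 256, 3)  # illustrative layer size, not from the patent
    print("standard :", intensity(*conv_costs(*shape)))
    print("separable:", intensity(*depthwise_separable_costs(*shape)))
```

Under these assumptions the separable layer performs far fewer multiply-accumulates but also does much less work per element fetched, so an accelerator running it spends proportionally more time waiting on memory.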

Embodiment Construction

[0046] The preferred embodiments of the present invention will be described below in conjunction with the accompanying drawings. It should be understood that the preferred embodiments described here are only used to illustrate and explain the present invention, and are not intended to limit the present invention.

[0047] As shown in Figure 1, this embodiment discloses a deep learning accelerator suitable for a stacked hourglass network, comprising a control module 1, a data calculation module 2, and a data cache module 3.

[0048] The control module 1 is connected to the main control processor. It receives the control signals input by the main control processor and controls the data calculation module 2 and the data cache module 3 according to those signals.

[0049] Specifically, as shown in Figure 2, the data calculation module 2 includes a plurality of layer calculation units 21; the layer calculation units 21 are used to perform data processing op...
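The relationship between the three modules described so far can be sketched in software. This is a toy model for orientation only, not the patented circuit: the class names, the dictionary-based control signal, and the buffer naming are all hypothetical, and the layer calculation units run one after another here where hardware units would operate in parallel.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DataCacheModule:
    """Module 3: holds data loaded into the accelerator (modelled as a dict)."""
    buffers: Dict[str, List[float]] = field(default_factory=dict)

    def load(self, name, values):
        self.buffers[name] = list(values)

    def read(self, name):
        return self.buffers[name]


@dataclass
class LayerCalculationUnit:
    """Unit 21: applies one layer operation to a block of data."""
    op: Callable[[List[float]], List[float]]

    def run(self, data):
        return self.op(data)


@dataclass
class DataCalculationModule:
    """Module 2: a collection of layer calculation units."""
    units: List[LayerCalculationUnit]

    def run_all(self, data):
        # Sequential stand-in for units that would work in parallel in hardware.
        return [unit.run(data) for unit in self.units]


class ControlModule:
    """Module 1: turns a control signal into actions on modules 2 and 3."""

    def __init__(self, calc, cache):
        self.calc = calc
        self.cache = cache

    def handle(self, signal):
        # Hypothetical signal format: {"input": buffer name, "output": name prefix}
        data = self.cache.read(signal["input"])
        results = self.calc.run_all(data)
        for index, result in enumerate(results):
            self.cache.load(f"{signal['output']}{index}", result)
        return results


if __name__ == "__main__":
    cache = DataCacheModule()
    cache.load("x", [-1.0, 2.0])
    calc = DataCalculationModule([
        LayerCalculationUnit(lambda d: [max(0.0, v) for v in d]),  # ReLU-like op
        LayerCalculationUnit(lambda d: [2.0 * v for v in d]),      # scaling op
    ])
    control = ControlModule(calc, cache)
    print(control.handle({"input": "x", "output": "y"}))
```

The point of the sketch is the direction of control: the main processor talks only to the control module, which reads from the cache, dispatches work to the layer calculation units, and writes results back.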

Abstract

The invention discloses a deep learning accelerator suitable for a stacked hourglass network. Layer calculation units that operate in parallel increase computational parallelism, and the data cache module raises the utilization of data loaded into the accelerator's internal cache while speeding up computation. In addition, the data adjuster inside the accelerator adaptively changes the order in which data is arranged according to the operation of the computing layer, which makes the fetched data more complete, improves the efficiency of data acquisition, and reduces memory access latency. As a result, while increasing the computing speed of the algorithm, the accelerator reduces the required memory bandwidth by lowering the number of memory accesses and improving memory access efficiency, thereby improving its overall acceleration performance.
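Why the order of data arrangement matters can be shown with a small address model. The sketch below is illustrative and does not describe the patent's data adjuster: it assumes two common tensor layouts (channel-major "CHW" and pixel-major "HWC") and counts how many contiguous address runs, called bursts here, are needed to fetch the operands of a pointwise or a depthwise operation. All names and sizes are hypothetical.

```python
def address(layout, c, y, x, C, H, W):
    """Linear address of element (c, y, x) in a C x H x W tensor."""
    if layout == "CHW":
        return (c * H + y) * W + x
    if layout == "HWC":
        return (y * W + x) * C + c
    raise ValueError(f"unknown layout: {layout}")


def count_bursts(addresses):
    """Number of contiguous address runs needed to cover all addresses."""
    ordered = sorted(set(addresses))
    if not ordered:
        return 0
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b != a + 1)


def pointwise_access(layout, y, x, C, H, W):
    """Addresses read by a 1 x 1 convolution at pixel (y, x): every channel."""
    return [address(layout, c, y, x, C, H, W) for c in range(C)]


def depthwise_access(layout, c, y, x, k, C, H, W):
    """Addresses read by a k x k depthwise window in channel c,
    with its top-left corner at (y, x)."""
    return [
        address(layout, c, y + dy, x + dx, C, H, W)
        for dy in range(k)
        for dx in range(k)
    ]


if __name__ == "__main__":
    C, H, W = 8, 16, 16  # illustrative tensor size
    for layout in ("CHW", "HWC"):
        pw = count_bursts(pointwise_access(layout, 3, 5, C, H, W))
        dw = count_bursts(depthwise_access(layout, 2, 3, 5, 3, C, H, W))
        print(layout, "pointwise bursts:", pw, "depthwise bursts:", dw)
```

In this model neither layout is best for both operations, which is one plausible reading of why the abstract describes the arrangement as changing adaptively with the computing layer.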

Description

Technical field

[0001] The invention belongs to the field of neural network training, and in particular relates to a deep learning accelerator suitable for a stacked hourglass network.

Background art

[0002] A deep neural network (DNN) is an algorithm model in deep learning. Because it outperforms traditional algorithms, it has been widely used in fields such as image classification, target recognition, and gesture recognition. Deep neural networks require a large amount of computation. Traditional general-purpose processors are slow because of architectural limitations and cannot meet the needs of real-time applications. It is therefore necessary to design dedicated neural network accelerators that provide hardware support for real-time deep neural network computation.

[0003] In the application of gesture recognition, a deep neural network structure called stacked hourglass network (Stacked...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06N3/063, G06N3/08, G06N3/04

CPC: G06N3/063, G06N3/08, G06N3/045

Inventor: 栗涛, 陈弟虎, 梁东宝, 萧嘉乐, 叶灵昶

Owner: SUN YAT SEN UNIV
