An acceleration method for realizing sparse convolutional neural network inference for hardware

A convolutional neural network and hardware implementation technology, applied in the fields of electronic information and deep learning. It addresses problems such as load imbalance, internal buffer misalignment, and inefficient accelerator architectures, with the effect of reducing logic complexity and improving overall efficiency.

Active Publication Date: 2019-05-03

SOUTHEAST UNIV +2

4 Cites, 27 Cited by


AI Technical Summary

Problems solved by technology

Many pruning algorithms have been proposed, but they mainly focus on the number of weights pruned and rarely consider the complexity of deploying the pruned network on an ASIC or FPGA accelerator architecture.

When the pruned network runs on a hardware accelerator platform, problems such as internal buffer misalignment and load imbalance occur, making the entire accelerator architecture inefficient.
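The load imbalance mentioned above can be illustrated with a small sketch. It is not taken from the patent: the weight values, the group size of 8, the keep ratio of 0.25, and the function names are all hypothetical, and the idea that each group of weights maps to one set of multipliers is an assumption made only for illustration.

```python
def magnitude_prune_global(weights, keep_ratio):
    """Unstructured pruning: keep the largest-magnitude weights over the whole vector."""
    n_keep = int(len(weights) * keep_ratio)
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(ranked[:n_keep])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def nonzeros_per_group(weights, g):
    """Count surviving weights in each block of g consecutive positions."""
    return [sum(1 for w in weights[s:s + g] if w != 0.0)
            for s in range(0, len(weights), g)]


# Hypothetical weights: the large magnitudes happen to cluster in the first block.
weights = [0.9, -0.8, 0.7, 0.6, 0.1, 0.05, -0.02, 0.01,
           0.5, 0.03, -0.04, 0.02, 0.01, 0.06, -0.07, 0.08]

pruned = magnitude_prune_global(weights, keep_ratio=0.25)
counts = nonzeros_per_group(pruned, g=8)
print(counts)  # one block keeps all four survivors, the other keeps none
```

If each block were served by its own compute lane (an assumption), the first lane would do all the work while the second sat idle, which is the kind of imbalance the passage describes.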


Embodiment Construction

[0029] The technical solutions and beneficial effects of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0030] The present invention provides a hardware-oriented acceleration method for sparse convolutional neural network inference, comprising a method for determining group pruning parameters for a sparse hardware acceleration architecture, a group pruning training method for that architecture, and a deployment method for forward inference of the sparse convolutional neural network.

[0031] Figure 1 is a schematic diagram of the group pruning scheme proposed by the present invention, applied along the channel direction of a convolutional layer. The working method is illustrated with an example in which the number of activation values fetched per batch is N_m = 16, the group (packet) length is g = 8, and the compression rate is Δ = 0.25.
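As a rough illustration of group pruning with g = 8 and Δ = 0.25, the sketch below keeps the largest-magnitude weights inside each group of g consecutive channel weights. It rests on two assumptions that the excerpt does not confirm: that Δ is the fraction of weights retained per group (so g × Δ = 2 weights survive), and that groups are formed from consecutive positions along the channel direction. The function name `group_prune` and the weight values are hypothetical.

```python
def group_prune(weights, g=8, delta=0.25):
    """Within each group of g consecutive weights, keep the int(g * delta)
    largest by magnitude and zero the rest (assumed reading of the scheme)."""
    if len(weights) % g != 0:
        raise ValueError("weight count must be a multiple of the group length")
    keep_per_group = int(g * delta)
    pruned = []
    for start in range(0, len(weights), g):
        group = weights[start:start + g]
        ranked = sorted(range(g), key=lambda i: abs(group[i]), reverse=True)
        keep = set(ranked[:keep_per_group])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


# 16 illustrative weights along the channel direction, i.e. two groups of 8.
weights = [0.9, -0.8, 0.7, 0.6, 0.1, 0.05, -0.02, 0.01,
           0.5, 0.03, -0.04, 0.02, 0.01, 0.06, -0.07, 0.08]

pruned = group_prune(weights, g=8, delta=0.25)
print(pruned)
```

Every group ends up with the same number of non-zero weights, so the work per group is fixed in advance regardless of where the large weights fall.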

[...

Abstract

The invention discloses a hardware-oriented acceleration method for sparse convolutional neural network inference. The method comprises a group pruning parameter determination method for a sparse hardware acceleration architecture, a group pruning training method for that architecture, and a deployment method for forward inference of the sparse convolutional neural network. The group (packet) length and the pruning rate are determined according to the number of multipliers in the hardware architecture. Weights are pruned by magnitude so that only the fraction given by the compression rate is retained. Network accuracy and compression rate after pruning are improved through incremental training. After the pruned network is fine-tuned, the weights and index parameters of the non-pruned positions are saved and sent to the computing units of the hardware architecture; each computing unit fetches a group-length set of activation values at the same time to complete sparse forward inference. Because the algorithm-level pruning parameters and pruning strategy are set according to the hardware architecture, the logic complexity of the sparse accelerator is reduced and the overall efficiency of its forward inference is improved.
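The deployment step described in the abstract (saving the surviving weights with their indices, then computing on a group-length set of activations) might look roughly like the sketch below. The storage layout, the function names `compress_groups` and `sparse_dot`, and the example values are all hypothetical. The sketch also assumes every group holds exactly the same number of non-zero weights, which is what group pruning is meant to guarantee.

```python
def compress_groups(pruned, g):
    """For each group of g weights, record the surviving values and their
    offsets inside the group (assumed storage layout, not the patent's)."""
    values, indices = [], []
    for start in range(0, len(pruned), g):
        for offset in range(g):
            w = pruned[start + offset]
            if w != 0.0:
                values.append(w)
                indices.append(offset)
    return values, indices


def sparse_dot(values, indices, activations, g, keep_per_group):
    """Multiply each stored weight by the activation its index selects from
    the group-length window of activations that belongs to its group."""
    acc = 0.0
    for k, (w, idx) in enumerate(zip(values, indices)):
        group = k // keep_per_group  # fixed count per group makes this lookup trivial
        acc += w * activations[group * g + idx]
    return acc


# Hypothetical pruned weights: two groups of 4 with 2 survivors each.
pruned = [0.5, 0.0, 0.0, -1.0, 0.0, 2.0, 0.0, 0.25]
activations = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

values, indices = compress_groups(pruned, g=4)
result = sparse_dot(values, indices, activations, g=4, keep_per_group=2)
dense = sum(w * a for w, a in zip(pruned, activations))
print(values, indices, result, dense)
```

Because the number of stored weights per group is constant, the group a weight belongs to follows from its position in the list, and only the offset within the group needs to be stored.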

Description

Technical field

[0001] The invention belongs to the technical field of electronic information and deep learning, and in particular relates to a hardware-oriented acceleration method for sparse convolutional neural network inference.

Background

[0002] A neural network model is a mathematical expression of a biological neural network learning system. In recent years, with growing computing power and the emergence of large-scale data sets, neural network models have been increasingly used in machine vision tasks such as image classification and object detection.

[0003] However, when using neural network models to solve problems, people tend to design deeper and larger convolutional neural networks (CNNs) and collect more data in order to obtain better results. As the complexity of the model increases, the number of model parameters increases, the scale of the model and the floating point numbers r...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06N3/04

CPC: Y02D10/00

Inventors: 陆生礼, 庞伟, 吴成路, 范雪梅, 舒程昊, 梁彪

Owner: SOUTHEAST UNIV
