Apparatus and method for realizing accelerator of sparse convolutional neural network

Convolutional neural network technology, applied to devices for implementing sparse convolutional neural network accelerators, reduces the amount of calculation and achieves a good performance-to-power ratio.

Inactive Publication Date: 2017-10-10
XILINX INC
0 Cites, 149 Cited by

AI Technical Summary

Problems solved by technology

However, the fully connected layer still has a large number of parameters. If the fully connected layer is sparsified, the amount of calculation is greatly reduced.
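The saving from a sparse fully connected layer can be illustrated with a small sketch. This is not the patented circuit: the function names `to_sparse_rows` and `sparse_fc`, the (column, value) storage layout, and the sample weights are all assumptions chosen for illustration. The sketch stores only the nonzero weights together with their positions and counts the multiplications actually performed.

```python
def to_sparse_rows(dense):
    """Keep only the nonzero weights of each row as (column, value) pairs.

    Hypothetical storage layout, chosen for illustration only.
    """
    return [[(j, w) for j, w in enumerate(row) if w != 0] for row in dense]


def sparse_fc(sparse_rows, x):
    """Compute y = W x using only the stored nonzero weights.

    Returns the output vector and the number of multiplications performed.
    """
    y = []
    mults = 0
    for row in sparse_rows:
        acc = 0.0
        for j, w in row:
            acc += w * x[j]  # one multiply-accumulate per nonzero weight
            mults += 1
        y.append(acc)
    return y, mults


# Illustrative 3x4 weight matrix with 3 nonzero entries out of 12.
dense_w = [
    [0.0, 2.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, -1.0],
    [0.0, 0.0, 0.0, 0.0],
]
x = [1.0, 2.0, 3.0, 4.0]

y, mults = sparse_fc(to_sparse_rows(dense_w), x)
dense_mults = len(dense_w) * len(dense_w[0])
print(y, mults, dense_mults)
```

In this toy case the sparse form needs 3 multiplications where a dense matrix-vector product would need 12; the actual ratio depends on how sparse the weights are.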



Embodiment Construction

[0045] Specific embodiments of the present invention will be explained in detail below in conjunction with the accompanying drawings.

[0046] Figure 3 is a schematic diagram of an apparatus for implementing a sparse convolutional neural network accelerator according to the present invention.

[0047] The invention provides a device for realizing a sparse convolutional neural network accelerator. As shown in Figure 3, the device mainly includes three major modules: a convolution and pooling unit, a fully connected unit, and a control unit. Specifically, the convolution and pooling unit, also known as the Convolution+Pooling module, performs convolution and pooling operations on the input data for a first number of iterations according to the convolution parameter information, so as to finally obtain the input vector of the sparse neural network, wherein each input data is divided into multiple sub-blocks, and the convolution and pooling unit performs convolution and poo...
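The idea of dividing the input into sub-blocks and running convolution and pooling on each block independently can be sketched in software. Everything below is an assumption for illustration rather than the design in the patent: the row-strip split, the overlap between strips (so that the blocked result matches the whole-image result), the "valid" convolution, the 2x2 max pooling, and all function names. A thread pool stands in for parallel hardware units; in CPython it shows the independence of the blocks, not a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def conv2d_valid(image, kernel):
    """Plain 'valid' 2D convolution (no padding, stride 1) with a k x k kernel."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2; a trailing odd row or column is dropped."""
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, len(fmap[0]) - 1, 2)
        ]
        for r in range(0, len(fmap) - 1, 2)
    ]


def conv_pool(image, kernel):
    return max_pool_2x2(conv2d_valid(image, kernel))


def split_rows(image, kernel_size, pooled_rows_per_block):
    """Split the input into overlapping row strips (assumed scheme).

    Pooled rows [p0, p1) need input rows [2*p0, 2*p1 + k - 1).
    """
    k = kernel_size
    total_pooled = (len(image) - k + 1) // 2
    blocks = []
    for p0 in range(0, total_pooled, pooled_rows_per_block):
        p1 = min(p0 + pooled_rows_per_block, total_pooled)
        blocks.append(image[2 * p0: 2 * p1 + k - 1])
    return blocks


def conv_pool_blocked(image, kernel, pooled_rows_per_block=1):
    """Run convolution + pooling on each sub-block and stitch the results."""
    blocks = split_rows(image, len(kernel), pooled_rows_per_block)
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(lambda b: conv_pool(b, kernel), blocks))
    return [row for part in parts for row in part]


# Illustrative 6x6 integer input and 3x3 kernel.
image = [[(r * c) % 7 for c in range(6)] for r in range(6)]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]

whole = conv_pool(image, kernel)
blocked = conv_pool_blocked(image, kernel, pooled_rows_per_block=1)
print(blocked == whole)
```

The overlap between strips is a software convenience so that the two results agree exactly; a hardware design may handle block boundaries differently.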


Abstract

The invention provides an apparatus and method for realizing an accelerator of a sparse convolutional neural network. The apparatus includes a convolution and pooling unit, a fully connected unit and a control unit. The method includes the following steps: on the basis of control information, reading convolution parameter information, input data and intermediate computing data, and reading fully connected layer weight matrix position information; in accordance with the convolution parameter information, performing convolution and pooling on the input data for a first number of iterations; then, on the basis of the fully connected layer weight matrix position information, performing fully connected computation for a second number of iterations. Each input data is divided into a plurality of sub-blocks, and the convolution and pooling unit and the fully connected unit each operate on the plurality of sub-blocks in parallel. The apparatus uses a dedicated circuit, supports convolutional neural networks with sparse fully connected layers, uses a parallel ping-pong buffer design and a pipeline design, effectively balances I/O bandwidth and computing efficiency, and achieves a better performance-to-power ratio.
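The ping-pong buffer design mentioned in the abstract can be sketched as a scheduling pattern: while one buffer is being consumed by the compute unit, the next block is loaded into the other buffer, and the two roles swap at each step. The function `run_ping_pong`, its schedule log, and the sample blocks are hypothetical; the sketch runs sequentially and only records which buffer plays which role, whereas in hardware the load and the computation would overlap in time.

```python
def run_ping_pong(blocks, compute):
    """Process blocks with two alternating buffers.

    Returns the list of results and a schedule of
    (step, buffer_loaded_during_step, buffer_computed_from) tuples.
    """
    buffers = [None, None]
    results = []
    schedule = []
    if not blocks:
        return results, schedule

    active = 0
    buffers[active] = blocks[0]  # initial fill before any computation

    for step in range(len(blocks)):
        idle = 1 - active
        loaded = None
        if step + 1 < len(blocks):
            # In hardware this load would run concurrently with the compute below.
            buffers[idle] = blocks[step + 1]
            loaded = idle
        results.append(compute(buffers[active]))
        schedule.append((step, loaded, active))
        active = idle  # swap roles for the next step

    return results, schedule


# Illustrative data: three blocks, with summation standing in for the real computation.
results, schedule = run_ping_pong([[1, 2], [3, 4], [5, 6]], sum)
print(results)
print(schedule)
```

Because loading and computing never touch the same buffer in a given step, I/O time can be hidden behind computation, which is one way to read the abstract's claim about balancing I/O bandwidth and computing efficiency.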

Description

Technical field

[0001] The present invention relates to artificial neural networks, and more particularly to devices and methods for implementing sparse convolutional neural network accelerators.

Background

[0002] An Artificial Neural Network (ANN), also referred to as a Neural Network (NN), is an algorithmic mathematical model that imitates the behavioral characteristics of animal neural networks and performs distributed parallel information processing. In recent years, neural networks have developed rapidly and are widely used in many fields, including image recognition, speech recognition, natural language processing, weather forecasting, gene expression, content recommendation and so on.

[0003] Figure 1 is a diagram illustrating the computation of a neuron in an artificial neural network.

[0004] The accumulated stimulus of a neuron is the sum of the stimuli delivered by other neurons, each multiplied by its corresponding weight. Xj is used to represent the accumulation of the j...
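The weighted accumulation described in paragraph [0004] can be written as a short sketch. The function name `accumulated_stimulus` and the sample values are assumptions; the sketch covers only the weighted sum stated in the text and does not guess at the parts of the description that are cut off.

```python
def accumulated_stimulus(inputs, weights):
    """Sum of each incoming stimulus multiplied by its corresponding weight."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(w * x for w, x in zip(weights, inputs))


# Illustrative values: three incoming stimuli and their weights.
total = accumulated_stimulus([1.0, 2.0, 3.0], [0.5, -1.0, 2.0])
print(total)
```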


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/063, G06N3/08
CPC: G06N3/063, G06N3/08, G06N3/045, G06F2207/4824, G06F7/5443, G06F7/57
Inventor: 谢东亮, 张玉, 单羿
Owner: XILINX INC