Sparse convolutional neural network acceleration method and system based on data flow architecture

A sparse convolutional neural network acceleration technology, applied in the field of computer architecture, that addresses the performance degradation caused by instruction loading and achieves the effect of reducing continuous access to memory.

Active Publication Date: 2020-12-01
INST OF COMPUTING TECH CHINESE ACAD OF SCI
Cites: 5. Cited by: 3.

AI Technical Summary

Problems solved by technology

[0006] To address the performance degradation caused by instruction loading for sparse networks in a coarse-grained data flow architecture, the invention analyzes the data and instruction characteristics of sparse networks and provides a method and device for accelerating them. The method reduces continuous access to the storage unit during instruction loading, thereby accelerating sparse networks.

Examples

Embodiment Construction

[0037] Consider the problem of instructions being loaded multiple times. Dense convolution performs multiply-add operations. For the convolution operations of different channels, it fetches the data at the same position in each channel and performs the multiply-add. Because of this regularity, once the compiler has mapped the operations of the different channels of a convolution layer onto the computing array (PE array), the instructions for the different channels are identical. The PE array therefore needs to load the instructions from memory only once to execute the operations of all channels, which ensures full utilization of computing resources.

[0038] For sparse convolution, however, the pruning operation sets some weights to 0. Since 0 multiplied by any number is still 0, the instructions corresponding to the zero-valued weights are removed after pruning in order to eliminate those operations, so that the co...
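The contrast between paragraphs [0037] and [0038] can be illustrated with a toy model. This is a minimal sketch, not the patent's compiler: the instruction format `("MAC", kernel_position)` and the function `compile_channel` are hypothetical, and the conclusion that pruned channels end up with different instruction streams is an inference from the surrounding text, since paragraph [0038] is truncated here.

```python
from typing import List, Tuple

# Hypothetical instruction format: (opcode, kernel position).
Instr = Tuple[str, int]


def compile_channel(weights: List[float], skip_zeros: bool) -> List[Instr]:
    """Emit one multiply-accumulate instruction per kernel weight.

    With skip_zeros=True, instructions for pruned (zero) weights are
    dropped, mimicking the instruction removal described for sparse
    convolution.
    """
    return [
        ("MAC", pos)
        for pos, w in enumerate(weights)
        if not (skip_zeros and w == 0.0)
    ]


# Two channels of a flattened 2x2 kernel; zeros stand for pruned weights.
channel_a = [0.5, 0.0, 1.5, 0.0]
channel_b = [0.0, 2.0, 0.0, 0.0]

dense_a = compile_channel(channel_a, skip_zeros=False)
dense_b = compile_channel(channel_b, skip_zeros=False)
sparse_a = compile_channel(channel_a, skip_zeros=True)
sparse_b = compile_channel(channel_b, skip_zeros=True)

# Dense: every channel gets the same stream, so one load serves all.
dense_same = dense_a == dense_b
# Sparse: the streams depend on which weights survived pruning.
sparse_same = sparse_a == sparse_b
```

In this toy setting the dense streams compare equal while the sparse ones do not, which is why a sparse layer can no longer rely on a single instruction load for all channels.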

Abstract

The invention provides a sparse convolutional neural network acceleration method and system based on a data flow architecture. For sparse convolution applications, an instruction-sharing detection device and a sparse convolution acceleration method are designed in software. The instructions generated by the compiler are detected and compared, instructions whose content is completely identical are marked, and their addresses are set to the same address. This achieves instruction sharing in sparse convolution, which reduces the memory accesses needed for instruction loading and shortens the sparse convolution running time.
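The detect, mark, and share-address idea in the abstract can be sketched as a simple deduplication pass. This is an illustration under stated assumptions, not the patented device: the function `assign_shared_addresses`, the choice to compare whole per-channel instruction blocks, and the use of small integers as addresses are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical instruction format: (opcode, kernel position).
Instr = Tuple[str, int]


def assign_shared_addresses(blocks: List[List[Instr]]) -> Tuple[List[int], int]:
    """Give instruction blocks with identical content the same address.

    Returns the address assigned to each block, plus the number of
    distinct blocks that would actually have to be loaded from memory.
    """
    address_of: Dict[Tuple[Instr, ...], int] = {}
    addresses: List[int] = []
    for block in blocks:
        key = tuple(block)  # hashable form used to compare block content
        if key not in address_of:
            address_of[key] = len(address_of)  # next unused address
        addresses.append(address_of[key])
    return addresses, len(address_of)


# Four channels after pruning; channels 0 and 2, and 1 and 3, match.
blocks = [
    [("MAC", 0), ("MAC", 2)],
    [("MAC", 1)],
    [("MAC", 0), ("MAC", 2)],
    [("MAC", 1)],
]
addresses, loads = assign_shared_addresses(blocks)
```

Here four blocks collapse to two distinct addresses, so only two loads are needed instead of four. A dictionary keyed on block content keeps the comparison linear in the number of blocks, rather than comparing every pair.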

Description

Technical field

[0001] The invention relates to computer architecture, and in particular to a method and system for accelerating a sparse convolution layer in a coarse-grained data flow architecture.

Background

[0002] Neural networks deliver advanced performance in image detection, speech recognition, and natural language processing. As applications become more complex, neural network models also become more complex, which poses many challenges for traditional hardware. To relieve the pressure on hardware resources, sparse networks offer advantages in computing, storage, and power consumption requirements. Many algorithms and accelerators for sparse networks already exist, such as the sparse-blas library for CPUs and the cusparse library for GPUs, which accelerate the execution of sparse networks to a certain extent. Consumption and other aspects have advanced expressiveness.

[0003] The coa...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/30, G06F9/32, G06N3/10, G06N3/08, G06N3/04
CPC: G06F9/30145, G06F9/32, G06N3/10, G06N3/082, G06N3/045, Y02D10/00
Inventor: 吴欣欣, 范志华, 轩伟, 李文明, 叶笑春, 范东睿
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI