
Full connection layer compression method and device, electronic equipment, accelerator and storage medium

A fully-connected layer compression technique applied in the field of neural networks, which improves real-time recognition performance and reduces storage overhead.

Pending Publication Date: 2022-02-18
EHIWAY MICROELECTRONIC SCI & TECH SUZHOU CO LTD +1

AI Technical Summary

Problems solved by technology

Meanwhile, the data-preparation time Tbatch of a batch operation is NumB times the ordinary pipeline time Tpipe, so batch operation significantly increases the output delay (Input Response Latency, IRL).
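The relationship above can be sketched as simple arithmetic. This is an illustrative calculation, not taken from the patent; the function name and the millisecond figures are assumptions.

```python
# Illustrative latency arithmetic: with batching, the data-preparation time
# scales with the batch size NumB, so the input response latency (IRL)
# grows proportionally relative to the ordinary pipeline time Tpipe.

def batch_prep_time(t_pipe: float, num_b: int) -> float:
    """Data-preparation time of a batched operation: Tbatch = NumB * Tpipe."""
    return num_b * t_pipe

t_pipe = 1.0  # hypothetical per-sample pipeline time in ms (assumed value)
for num_b in (1, 8, 32):
    print(f"NumB={num_b:2d}  Tbatch={batch_prep_time(t_pipe, num_b):.1f} ms")
```

With NumB = 32 the preparation time is 32x the single-sample pipeline time, which is the latency penalty the patent attributes to batched fully-connected layers.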




Embodiment Construction

[0042] Owing to the flexible programmability and outstanding performance-to-power ratio of Field Programmable Gate Arrays (FPGAs), current mainstream CNN forward-inference accelerators mostly adopt FPGA-based acceleration solutions. A pipelined architecture is a common overall accelerator architecture: it can customize the computing structure of each layer to fully match that layer's computational needs, and is therefore conducive to fully exploiting the chip's computing potential. To ensure good pipeline performance, and in keeping with the data supply-demand relationship and dataflow of each CNN layer, data between layers is usually stored in the ping-pong fashion shown in Figure 1, where part C represents the computation part and part S represents the ping-pong storage part. According to the Roofline model, if the chip bandwidth cannot meet the needs of computing data, the FPGA'...
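The ping-pong storage between pipeline stages can be sketched in software terms as a double buffer: while the downstream layer reads one buffer, the upstream layer writes the other, and the roles swap each cycle. This is a minimal illustrative model; the class and method names are assumptions, not part of the accelerator described in the text.

```python
# Minimal sketch of ping-pong (double) buffering between two pipeline stages.
# The producer writes one buffer while the consumer reads the other; swap()
# exchanges the roles at the end of each pipeline cycle.

class PingPongBuffer:
    def __init__(self, size: int):
        self.buffers = [[0] * size, [0] * size]
        self.write_sel = 0  # index of the buffer currently being written

    def write(self, data):
        """Producer side: fill the currently selected write buffer."""
        self.buffers[self.write_sel][:] = data

    def read(self):
        """Consumer side: read the buffer written in the previous cycle."""
        return list(self.buffers[1 - self.write_sel])

    def swap(self):
        """End of cycle: exchange producer and consumer buffers."""
        self.write_sel = 1 - self.write_sel

buf = PingPongBuffer(4)
buf.write([1, 2, 3, 4])  # cycle 0: producer fills buffer 0
buf.swap()               # cycle 1: consumer now reads buffer 0
print(buf.read())
```

The cost is that every inter-layer boundary needs two full copies of the intermediate result, which is exactly the storage overhead the compression method targets.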



Abstract

The invention discloses a fully-connected layer compression method and device, an accelerator, electronic equipment, and a storage medium, applied in the technical field of neural networks. The method comprises: dividing all layers of the fully-connected network into alternating first layers and second layers, and compressing each pair of a first layer and a second layer into one binding layer, such that no ping-pong storage exists between the first layer and the second layer. By compressing two adjacent layers into one binding layer, the storage overhead of intermediate results can be effectively reduced. In addition, because the fully-connected layers are compressed, the number of pipeline stages is halved relative to the uncompressed network, so the output delay is greatly shortened and the real-time recognition performance of the accelerator is improved.
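The abstract's claim about pipeline stages can be illustrated with back-of-the-envelope arithmetic. This sketch is an assumption-laden reading of the abstract (the function name and the handling of an odd trailing layer are my own), not the patent's formulas.

```python
# Illustrative stage count after binding-layer compression: pairing adjacent
# fully-connected layers into binding layers halves the number of pipeline
# stages (an odd trailing layer, if any, is assumed to remain uncompressed).

def compressed_stages(num_layers: int) -> int:
    """Pipeline stages after fusing adjacent layer pairs into binding layers."""
    return (num_layers + 1) // 2

for n in (4, 6, 7):
    print(f"{n} FC layers -> {compressed_stages(n)} binding-layer stages")
```

Since output delay in a pipelined accelerator scales with the number of stages, halving the stage count halves the contribution of the fully-connected layers to that delay, and each eliminated boundary also removes one ping-pong buffer pair.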

Description

technical field [0001] The present application relates to the technical field of neural networks, and in particular to a fully-connected layer compression method, device, accelerator, electronic equipment, and storage medium. Background technique [0002] Convolutional Neural Networks (CNNs) are among the representative algorithms of deep learning. Owing to their excellent performance in the field of artificial intelligence, CNNs have received wide attention and have been applied to high-tech applications such as image classification, speech recognition, face recognition, autonomous driving, and medical imaging. Excellent CNN structures such as AlexNet, VGG, and ResNet have emerged. With the continuous development of CNNs, network structures have become increasingly complex and parameter counts have exploded, which poses challenges for the design of CNN hardware accelerators. [0003] The fully-connected layer structure of batch operations can r...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/04; G06F17/16; G06F7/523
CPC: G06F17/16; G06F7/523; G06N3/045
Inventor: 屈心媛, 黄志洪, 蔡刚, 方震
Owner: EHIWAY MICROELECTRONIC SCI & TECH SUZHOU CO LTD