A software-hardware collaborative acceleration method, system, and computer-readable storage medium

A software-hardware collaborative acceleration technology applied in the field of deep learning. It addresses problems such as the inability to improve performance and high chip cost, and achieves reduced on-chip and off-chip memory access, high throughput, and improved accuracy.

Active Publication Date: 2020-07-10
浙江芯劢微电子股份有限公司

AI Technical Summary

Problems solved by technology

[0004] In the security field, applications such as face detection in access control systems and license plate recognition in parking lots need only low-end, low-cost chips to meet normal requirements. However, some current acceleration chips are expensive and cannot improve performance in specific scenarios. It is therefore necessary to customize the algorithm for the scenario and to customize a dedicated neural network acceleration chip, so as to improve performance and accuracy and to reduce cost.
[0005] At present, convolutional neural networks deployed on chips in the security market are processed by software on an embedded chip, and the entire convolution process is implemented in software. Chips that use hardware processing cost too much. In the existing technology, however, there is no complete solution describing software-hardware collaborative acceleration of convolutional neural network processing on embedded devices.

Examples

Embodiment 1

[0087] As shown in Figure 1, this embodiment provides a software-hardware collaborative acceleration method based on a convolutional neural network, including the following steps:

[0088] The upper computer performs network parsing: for different network types, the model is parsed into a unified data structure divided by layer. The network platform is added to the structure header of the data structure, a layer serial number is added to each layer's data structure, and layer associations are established: according to the layer serial number and the layer name, the input layers and output layers of the current layer are associated;

[0089] Quantization: quantize the weights and data layer by layer according to the layer serial number;

[0090] Hardware parameter calculation: calculate the number of tiles obtained after the feature data is divided according to the internal storage size N*N;

[0091] Database firmware generation: Merge the data structures of each layer, merge t...
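
The first three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patent's implementation: the `Layer` and `Model` classes, the `bottom` key for input layer names, the `"caffe"` platform label, the symmetric 8-bit quantization scheme, and the reading of "N*N" as a square tile of side N are all hypothetical choices made for the example.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Layer:
    serial: int                                        # layer serial number assigned while parsing
    name: str
    kind: str
    weights: List[float] = field(default_factory=list)
    inputs: List[int] = field(default_factory=list)    # serials of layers feeding this one
    outputs: List[int] = field(default_factory=list)   # serials of layers fed by this one
    scale: float = 1.0
    q_weights: List[int] = field(default_factory=list)


@dataclass
class Model:
    platform: str          # network platform stored in the structure header
    layers: List[Layer]


def parse_network(platform: str, raw_layers: List[Dict]) -> Model:
    """Build the unified per-layer structure and link layers by name."""
    layers = [Layer(i, r["name"], r["kind"], list(r.get("weights", [])))
              for i, r in enumerate(raw_layers)]
    by_name = {layer.name: layer for layer in layers}
    for layer, raw in zip(layers, raw_layers):
        for src_name in raw.get("bottom", []):          # hypothetical key for input names
            src = by_name[src_name]
            layer.inputs.append(src.serial)
            src.outputs.append(layer.serial)
    return Model(platform, layers)


def quantize_layer(layer: Layer, bits: int = 8) -> None:
    """Symmetric linear quantization of one layer's weights (assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1
    peak = max((abs(w) for w in layer.weights), default=0.0)
    layer.scale = peak / qmax if peak > 0 else 1.0
    layer.q_weights = [max(-qmax, min(qmax, round(w / layer.scale)))
                       for w in layer.weights]


def tile_count(height: int, width: int, n: int) -> int:
    """Number of N*N tiles needed to cover a height x width feature map."""
    return math.ceil(height / n) * math.ceil(width / n)


raw = [
    {"name": "data", "kind": "input"},
    {"name": "conv1", "kind": "conv", "bottom": ["data"], "weights": [1.0, -1.0, 0.0]},
    {"name": "relu1", "kind": "relu", "bottom": ["conv1"]},
]
model = parse_network("caffe", raw)
for layer in sorted(model.layers, key=lambda l: l.serial):   # layer by layer, in serial order
    quantize_layer(layer)
tiles = tile_count(224, 224, 64)
```

Linking layers through serial numbers rather than object references keeps the structure easy to serialize, which matters if the per-layer records are later merged into a single firmware image.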

Embodiment 2

[0138] This embodiment provides a software-hardware collaborative acceleration system based on a convolutional neural network. The system implements the method described in Embodiment 1 and includes an upper computer subsystem and a lower computer subsystem.

[0139] The upper computer subsystem includes:

[0140] An acquisition module, used to acquire the network model and its parameters;

[0141] A firmware generation module, used to generate database firmware based on the network model and its parameters, and including:

[0142] A layer merging unit: according to the features supported by the hardware and the rules of software optimization, adjacent (upstream and downstream) associated layers are merged to reduce the number of execution steps in software or hardware;

[0143] Resource calculation unit: calculate the resource consumption of the input layer and output layer of the network, the input resource is L*L*I* bit width, the output resour...
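
A minimal sketch of the layer merging unit and the input-resource formula might look like the following. The merge rule (a convolution step absorbing a following batch normalization or ReLU layer) is an assumed example of a "feature supported by the hardware", and reading L as the input side length and I as the number of input channels is likewise an assumption; the output-resource formula is truncated in the text above, so it is not reproduced here.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical rule: a step that starts with a convolution can absorb these layer kinds.
ABSORBABLE = {"batchnorm", "relu"}


@dataclass
class Step:
    kinds: List[str]   # layer kinds executed together as one software or hardware step


def merge_layers(kinds: List[str]) -> List[Step]:
    """Greedily fold absorbable layers into the preceding convolution step.

    Assumes a simple linear chain of layers, given in execution order.
    """
    steps: List[Step] = []
    for kind in kinds:
        if steps and steps[-1].kinds[0] == "conv" and kind in ABSORBABLE:
            steps[-1].kinds.append(kind)
        else:
            steps.append(Step([kind]))
    return steps


def input_resource_bits(side: int, channels: int, bit_width: int) -> int:
    """Input resource as L*L*I*bit width, with L = side and I = channels (assumed meaning)."""
    return side * side * channels * bit_width


steps = merge_layers(["conv", "relu", "pool", "conv", "batchnorm", "relu"])
bits = input_resource_bits(224, 3, 8)
```

In this example six layers collapse into three execution steps, which is the kind of reduction the layer merging unit is described as targeting.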

Embodiment 3

[0163] This embodiment provides a computer-readable storage medium storing a computer program. When the computer program is executed, the method described in any one of the foregoing embodiments can be implemented. Any reference to memory, storage, database, or other media used in the various embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic...

Abstract

The present invention discloses a software-hardware collaborative acceleration method, system, and computer-readable storage medium in the technical field of deep learning. The method is based on a convolutional neural network and includes the following steps: the upper computer acquires a network model and its parameters and generates database firmware from them; the lower computer obtains and parses the database firmware and starts a hardware acceleration module and/or a software acceleration module to perform acceleration. The method realizes the coordinated operation of software and hardware for a convolutional neural network on an embedded device.
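
The host-to-device flow in the abstract can be illustrated with a small Python sketch. Everything concrete here is hypothetical: the binary layout (a `CNNF` magic tag, a platform id, a layer count, then one record per layer), the numeric layer-kind codes, and the 0/1 flag selecting software or hardware execution are invented for illustration and are not taken from the patent.

```python
import struct
from typing import Callable, List, Tuple

# Hypothetical firmware layout, little-endian with no padding.
HEADER = struct.Struct("<4sBH")   # magic tag, platform id, number of layer records
RECORD = struct.Struct("<HBB")    # layer serial, layer kind code, target (0 = software, 1 = hardware)
MAGIC = b"CNNF"

LayerRecord = Tuple[int, int, int]


def build_firmware(platform_id: int, layers: List[LayerRecord]) -> bytes:
    """Upper computer side: merge per-layer records into one firmware blob."""
    blob = HEADER.pack(MAGIC, platform_id, len(layers))
    for serial, kind, target in layers:
        blob += RECORD.pack(serial, kind, target)
    return blob


def parse_firmware(blob: bytes) -> Tuple[int, List[LayerRecord]]:
    """Lower computer side: read the header, then each layer record."""
    magic, platform_id, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a firmware image")
    records = [RECORD.unpack_from(blob, HEADER.size + i * RECORD.size)
               for i in range(count)]
    return platform_id, records


def run(blob: bytes,
        hw_exec: Callable[[int, int], str],
        sw_exec: Callable[[int, int], str]) -> List[str]:
    """Dispatch each layer, in serial order, to the hardware or software path."""
    _, records = parse_firmware(blob)
    trace = []
    for serial, kind, target in sorted(records):
        handler = hw_exec if target == 1 else sw_exec
        trace.append(handler(serial, kind))
    return trace


firmware = build_firmware(1, [(0, 0, 1), (1, 1, 1), (2, 2, 0)])
trace = run(firmware,
            hw_exec=lambda serial, kind: f"hw:{serial}",
            sw_exec=lambda serial, kind: f"sw:{serial}")
```

Passing the two executors as callables keeps the dispatcher independent of how the hardware acceleration module and the software acceleration module are actually implemented on a given device.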

Description

Technical field

[0001] The present invention relates to the technical field of deep learning, in particular to a software-hardware collaborative acceleration method, system, and computer-readable storage medium.

Background

[0002] The security field currently has massive amounts of data, which can provide enough scenarios for deep learning training. The development of intelligent algorithms based on convolutional neural networks relies on massive data and has made important breakthroughs in speech recognition and vision. With faster iterations, the accuracy of these algorithms has exceeded human recognition accuracy.

[0003] With the deployment of artificial intelligence in the security field, there is an urgent need for processing chips with powerful computing power. Current big data and algorithms need to be deployed and validated on embedded chips.

[0004] In the field of security, face detection in access control systems, license plate recognition in parking plan...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06N3/063, G06N3/10, G06N3/04, G06F15/78
CPC: G06N3/063, G06N3/10, G06F15/781, G06F15/7839, G06N3/048, G06N3/045
Inventor: 吴春选
Owner: 浙江芯劢微电子股份有限公司