Software and hardware cooperative acceleration method and system and computer readable storage medium

A software-hardware cooperative acceleration technology, applied in the field of deep learning, that addresses problems such as the inability to improve performance and high chip cost, and achieves reduced on-chip and off-chip memory access, high throughput, and improved accuracy.

Active Publication Date: 2020-05-19
杭州雄迈集成电路技术股份有限公司

AI Technical Summary

Problems solved by technology

[0004] In the security field, applications such as face detection in access control systems and license plate recognition in parking lots need only low-end, low-cost chips to meet normal requirements. However, some current acceleration chips are expensive and cannot improve performance in specific scenarios. It is therefore necessary to customize the algorithm for the scene and to customize a dedicated neural network acceleration chip, in order to improve performance and accuracy and reduce cost.



Examples


Embodiment 1

[0087] As shown in Figure 1, this embodiment provides a software-hardware collaborative acceleration method based on a convolutional neural network, including the following steps:

[0088] Network analysis on the upper computer: for different network types, the model is parsed into a unified data structure divided by layer. The network platform is recorded in the structure header of the data structure, a layer serial number is added to each layer's data structure, and layer associations are established: according to the layer serial number and the layer name, the input layers and output layers of the current layer are associated;

[0089] Quantization: quantize the weights and data layer by layer according to the layer serial number;

[0090] Hardware parameter calculation: calculate the number of tiles produced when the feature data is divided according to the N*N internal storage;

[0091] Database firmware generation: Merge the data structures of each layer, merge t...
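Steps [0088] to [0090] can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the `Layer` and `Model` classes, their field names, the symmetric 8-bit weight quantization, and the ceiling-division tiling rule are all hypothetical choices made for this example. The source says only that each layer carries a serial number and input/output associations, that weights and data are quantized layer by layer, and that feature data is divided according to an N*N internal store.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Layer:
    index: int                                         # layer serial number
    name: str
    weights: List[float]
    inputs: List[int] = field(default_factory=list)    # serial numbers of producer layers
    outputs: List[int] = field(default_factory=list)   # serial numbers of consumer layers
    scale: float = 1.0                                 # per-layer quantization scale (assumed scheme)
    qweights: List[int] = field(default_factory=list)


@dataclass
class Model:
    platform: str          # network platform, kept in the structure header
    layers: List[Layer]


def parse_network(platform, raw_layers):
    """Build the per-layer structure from (name, weights, input_names) tuples."""
    layers = [Layer(i, name, list(w)) for i, (name, w, _) in enumerate(raw_layers)]
    by_name = {layer.name: layer for layer in layers}
    # Associate each layer with its inputs and outputs via name -> serial number.
    for layer, (_, _, input_names) in zip(layers, raw_layers):
        for input_name in input_names:
            src = by_name[input_name]
            layer.inputs.append(src.index)
            src.outputs.append(layer.index)
    return Model(platform, layers)


def quantize(model, bits=8):
    """Quantize weights layer by layer, in serial-number order (symmetric, assumed)."""
    qmax = 2 ** (bits - 1) - 1
    for layer in sorted(model.layers, key=lambda l: l.index):
        peak = max((abs(w) for w in layer.weights), default=0.0)
        layer.scale = peak / qmax if peak > 0 else 1.0
        layer.qweights = [round(w / layer.scale) for w in layer.weights]


def tile_count(height, width, n):
    """Number of N*N tiles needed to cover a height x width feature map (assumed rule)."""
    return math.ceil(height / n) * math.ceil(width / n)
```

For instance, under these assumptions a 224 x 224 feature map and a 32 x 32 internal store would be cut into `tile_count(224, 224, 32)` tiles, that is 7 tiles along each side.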

Embodiment 2

[0138] This embodiment provides a software-hardware collaborative acceleration system that, based on a convolutional neural network, implements the method described in Embodiment 1. The system includes an upper computer subsystem and a lower computer subsystem.

[0139] The upper computer subsystem includes:

[0140] The acquisition module is used to acquire the network model and its parameters;

[0141] A firmware generation module, used to generate database firmware based on the network model and its parameters, including:

[0142] Layer merging unit: according to the features supported by the hardware and the rules of software optimization, the upper and lower associated layers are merged to reduce the execution steps of software or hardware;

[0143] Resource calculation unit: calculate the resource consumption of the input layer and output layer of the network; the input resource is L*L*I*bit width, the output resource is O...
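The layer merging unit and the input-resource formula of the resource calculation unit can be sketched as follows. This is a hedged illustration only: the `FUSABLE` rule set, the function names, and the restriction to a simple linear chain of layers are assumptions for the example, since the source does not list the actual merging rules. The sketch also assumes that L is the side length of the input feature map and I the number of input channels, which the excerpt does not define, and it leaves out the output resource because that formula is truncated in the text.

```python
# Hypothetical fusion rules: (upper layer type, lower layer type) pairs that
# the hardware is assumed to execute as a single step.
FUSABLE = {("conv", "batchnorm"), ("conv", "relu"), ("batchnorm", "relu")}


def merge_layers(layer_types):
    """Greedily merge each layer into the preceding group when the pair is fusable.

    Assumes a linear chain, where each layer's only input is the previous layer.
    """
    merged = []
    for layer_type in layer_types:
        if merged and (merged[-1][-1], layer_type) in FUSABLE:
            merged[-1].append(layer_type)   # fewer execution steps
        else:
            merged.append([layer_type])
    return merged


def input_resource_bits(l, i, bit_width):
    """Input resource as stated in the text: L * L * I * bit width."""
    return l * l * i * bit_width
```

With these assumed rules, a chain of six layers such as conv, batchnorm, relu, pool, conv, relu collapses into three execution groups, and an 8-bit input with L = 224 and I = 3 would occupy `input_resource_bits(224, 3, 8)` bits.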

Embodiment 3

[0163] This embodiment provides a computer-readable storage medium storing a computer program. When the computer program is executed, the method described in any one of the foregoing embodiments can be implemented. Any reference to memory, storage, database or other media used in the various embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic...


Abstract

The invention discloses a software and hardware cooperative acceleration method and system, and a computer-readable storage medium, and relates to the technical field of deep learning. The method is based on a convolutional neural network and comprises the following steps: an upper computer obtains a network model and its parameters, and generates database firmware according to the network model and the parameters; a lower computer acquires and parses the database firmware, and starts the hardware acceleration module and/or the software acceleration module for acceleration. The method realizes a software and hardware cooperative operation process for the convolutional neural network on embedded equipment.

Description

Technical field

[0001] The present invention relates to the technical field of deep learning, in particular to a software-hardware collaborative acceleration method, system and computer-readable storage medium.

Background technique

[0002] The security field currently has massive amounts of data, which provide enough scenarios for deep learning training. Intelligent algorithms based on convolutional neural networks rely on massive data; they have made important breakthroughs in speech recognition and vision, iterate faster, and have reached an accuracy that exceeds human recognition accuracy.

[0003] As artificial intelligence is deployed in the security field, there is an urgent need for processing chips with powerful computing power. Current big data and algorithms need to be put into practice on embedded chips.

[0004] In the field of security, face detection in access control systems, license plate recognition in parking plan...


Application Information

IPC(8): G06N3/063, G06N3/10, G06N3/04, G06F15/78
CPC: G06N3/063, G06N3/10, G06F15/781, G06F15/7839, G06N3/048, G06N3/045
Inventor: 吴春选
Owner: 杭州雄迈集成电路技术股份有限公司