Software-hardware cooperative acceleration method and system, and computer-readable storage medium
A software-hardware collaborative acceleration technology, applied in the field of deep learning. It addresses problems such as performance that cannot be improved and high chip cost, and achieves reduced on-chip and off-chip memory access, high throughput, and improved accuracy.
Examples
Embodiment 1
[0087] As shown in Figure 1, this embodiment provides a software-hardware collaborative acceleration method based on a convolutional neural network, including the following steps:
[0088] Network analysis by the upper computer: for each network type, the model is parsed into a unified data structure organized by layer. The network platform is recorded in the structure header, a layer serial number is added to each layer's data structure, and layer associations are established: using the layer serial number and the layer name, each layer is associated with its input layers and output layers;
[0089] Quantization: quantize the weights and data layer by layer according to the layer serial number;
[0090] Hardware parameter calculation: calculate the number of tiles produced when the feature data is divided according to the N*N internal storage;
[0091] Database firmware generation: Merge the data structures of each layer, merge t...
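The first three steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names Layer, ParsedNetwork, parse_network, quantize_symmetric, and tile_count are hypothetical, the symmetric signed-integer quantization scheme is an assumption (the text does not say which scheme is used), and the tile count assumes that the feature map is cut spatially into N*N tiles with partial tiles counted as whole ones.

```python
from dataclasses import dataclass, field
from math import ceil
from typing import List, Tuple


@dataclass
class Layer:
    serial: int                 # layer serial number assigned during parsing
    name: str                   # layer name taken from the source model
    kind: str                   # e.g. "conv", "relu" (hypothetical labels)
    input_names: List[str]      # names of the layers feeding this one
    inputs: List[int] = field(default_factory=list)   # serials of input layers
    outputs: List[int] = field(default_factory=list)  # serials of output layers


@dataclass
class ParsedNetwork:
    platform: str               # network platform kept in the structure header
    layers: List[Layer]


def parse_network(platform, raw_layers):
    """Build the per-layer structure from (name, kind, input_names) tuples."""
    layers = [Layer(i, name, kind, list(ins))
              for i, (name, kind, ins) in enumerate(raw_layers)]
    by_name = {layer.name: layer for layer in layers}
    # Associate each layer with its input and output layers by name and serial.
    for layer in layers:
        for src_name in layer.input_names:
            src = by_name[src_name]
            layer.inputs.append(src.serial)
            src.outputs.append(layer.serial)
    return ParsedNetwork(platform, layers)


def quantize_symmetric(values, bits=8) -> Tuple[List[int], float]:
    """Assumed scheme: symmetric quantization with one scale per layer."""
    qmax = 2 ** (bits - 1) - 1
    peak = max((abs(v) for v in values), default=0.0)
    scale = peak / qmax if peak > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def tile_count(height, width, n):
    """Number of N*N tiles needed to cover a height*width feature map."""
    return ceil(height / n) * ceil(width / n)
```

For example, parse_network("caffe", [("conv1", "conv", []), ("relu1", "relu", ["conv1"])]) yields two layers with serials 0 and 1, where layer 1 lists layer 0 as its input and layer 0 lists layer 1 as its output. The platform string and layer names here are placeholders.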
Embodiment 2
[0138] This embodiment provides a software-hardware collaborative acceleration system. The system is based on a convolutional neural network, implements the method described in Embodiment 1, and includes an upper computer subsystem and a lower computer subsystem.
[0139] The upper computer subsystem includes:
[0140] An acquisition module, used to acquire the network model and its parameters;
[0141] A firmware generation module, used to generate database firmware based on the network model and its parameters,
[0142] including: a layer merging unit, which merges preceding and following associated layers according to the features supported by the hardware and the software optimization rules, to reduce the number of software or hardware execution steps;
[0143] A resource calculation unit, which calculates the resource consumption of the network's input layer and output layer: the input resource is L*L*I*bit width, the output resource is O...
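The layer merging unit and the input side of the resource calculation unit can be sketched as follows. This is an illustration under stated assumptions, not the patent's method: the set of fusable layer pairs is hypothetical (the text only says merging follows hardware-supported features and software optimization rules), L is read as the side length of the input feature map and I as the number of input channels, and the output formula is omitted because it is truncated in the source.

```python
# Hypothetical pairs of adjacent layer kinds that the hardware could
# execute as one step; the real rule set is not given in the text.
FUSABLE = {("conv", "batchnorm"), ("batchnorm", "relu"), ("conv", "relu")}


def merge_layers(kinds):
    """Greedily group consecutive layers whose adjacent kinds are fusable."""
    merged = []
    for kind in kinds:
        if merged and (merged[-1][-1], kind) in FUSABLE:
            merged[-1].append(kind)   # extend the current merged step
        else:
            merged.append([kind])     # start a new execution step
    return merged


def input_resource_bits(l, i, bit_width):
    """Input resource as stated in the text: L * L * I * bit width."""
    return l * l * i * bit_width
```

With these assumptions, merge_layers(["conv", "relu", "pool", "conv", "relu"]) reduces five layers to three execution steps, and input_resource_bits(224, 3, 8) gives 1204224 bits for a 224*224 input with 3 channels at 8-bit width.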
Embodiment 3
[0163] This embodiment provides a computer-readable storage medium storing a computer program. When the computer program is executed, the method described in any one of the foregoing embodiments can be implemented. Any reference to memory, storage, database, or other media used in the various embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic...